Merge lp:~stub/charms/trusty/cassandra/spike into lp:charms/trusty/cassandra

Proposed by Stuart Bishop
Status: Merged
Merged at revision: 362
Proposed branch: lp:~stub/charms/trusty/cassandra/spike
Merge into: lp:charms/trusty/cassandra
Diff against target: 7203 lines (+2676/-3234)
41 files modified
Makefile (+10/-6)
README.md (+4/-11)
charm-helpers.yaml (+2/-3)
config.yaml (+1/-18)
hooks/actions.py (+380/-191)
hooks/charmhelpers/contrib/benchmark/__init__.py (+124/-0)
hooks/charmhelpers/contrib/charmsupport/nrpe.py (+3/-1)
hooks/charmhelpers/contrib/peerstorage/__init__.py (+0/-150)
hooks/charmhelpers/contrib/unison/__init__.py (+0/-298)
hooks/charmhelpers/coordinator.py (+607/-0)
hooks/charmhelpers/core/hookenv.py (+230/-14)
hooks/charmhelpers/core/host.py (+1/-1)
hooks/charmhelpers/core/services/base.py (+41/-17)
hooks/charmhelpers/core/strutils.py (+2/-2)
hooks/charmhelpers/fetch/giturl.py (+7/-5)
hooks/cluster-relation-changed (+20/-0)
hooks/cluster-relation-departed (+20/-0)
hooks/config-changed (+20/-0)
hooks/coordinator.py (+35/-0)
hooks/data-relation-changed (+20/-0)
hooks/data-relation-departed (+20/-0)
hooks/database-admin-relation-changed (+20/-0)
hooks/database-relation-changed (+20/-0)
hooks/definitions.py (+51/-62)
hooks/helpers.py (+194/-517)
hooks/hooks.py (+22/-7)
hooks/install (+20/-0)
hooks/leader-elected (+20/-0)
hooks/leader-settings-changed (+20/-0)
hooks/nrpe-external-master-relation-changed (+20/-0)
hooks/relations.py (+19/-3)
hooks/rollingrestart.py (+0/-257)
hooks/stop (+20/-0)
hooks/upgrade-charm (+20/-0)
templates/cassandra_maintenance_cron.tmpl (+1/-1)
tests/test_actions.py (+524/-515)
tests/test_definitions.py (+37/-21)
tests/test_helpers.py (+93/-589)
tests/test_integration.py (+27/-8)
tests/test_rollingrestart.py (+0/-536)
tests/tests.yaml (+1/-1)
To merge this branch: bzr merge lp:~stub/charms/trusty/cassandra/spike
Reviewer (review type): status
Tim Van Steenburgh (community): Approve
charmers: Pending
Review via email: mp+262608@code.launchpad.net

Description of the change

Rework the Cassandra charm to rely on leadership, and maintain the unit and service status while we are at it.

By using leadership, we increase reliability and are able to remove a large amount of code that was needed to cope with flaky coordination.
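
For reference, the leadership pattern the reworked charm relies on looks
roughly like the sketch below. It mirrors the leader_only decorator and the
hookenv.leader_set() call visible in hooks/actions.py in the diff; the body
of maintain_seeds() here is a simplified placeholder, not the charm's full
implementation.

    from functools import wraps

    from charmhelpers.core import hookenv


    def leader_only(func):
        '''Run the decorated action only on the Juju leader unit.'''
        @wraps(func)
        def wrapper(*args, **kw):
            if hookenv.is_leader():
                return func(*args, **kw)
            return None
        return wrapper


    @leader_only
    def maintain_seeds():
        # Only the leader publishes shared state; the other units read it
        # back with hookenv.leader_get() from their own hooks.
        seed_ips = {hookenv.unit_private_ip()}  # placeholder seed list
        hookenv.leader_set(seeds=','.join(sorted(seed_ips)))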

Revision history for this message
Stuart Bishop (stub) wrote :

The charmhelpers coordinator module has not landed and can be reviewed at https://code.launchpad.net/~stub/charm-helpers/coordinator/+merge/261267
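
As a rough illustration of how this branch consumes that module, based on the
hooks/actions.py changes further down in the diff (the require() guard and the
'restart' lock name are used there; the guard body below is a simplified
placeholder):

    # hooks/coordinator.py in this branch exposes a shared coordinator
    # instance built on charmhelpers.coordinator, presumably the Serial
    # implementation described in the module docstring (one lock granted
    # at a time, first come, first served).
    from coordinator import coordinator

    import helpers


    def needs_restart():
        '''Guard: request the 'restart' lock only if a restart is needed.'''
        return not helpers.is_cassandra_running()  # simplified condition


    @coordinator.require('restart', needs_restart)
    def maybe_restart():
        # Runs only once this unit has been granted the 'restart' lock,
        # so at most one unit restarts Cassandra at any time.
        helpers.stop_cassandra()
        helpers.start_cassandra()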

Revision history for this message
Stuart Bishop (stub) wrote :

I've also taken the opportunity to update the default versions. With 2.1.7, the Cassandra 2.1 series is finally the recommended version, and DataStax has released DSE 4.7 (now based on Cassandra 2.1).

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

Hey stub, there are some failures in the tests. You may already be aware, but I wanted to point it out because the test runner is, for reasons unknown to me at this point, exiting 0 in spite of the failures.

Check out the console logs here: http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-259

review: Needs Fixing
Revision history for this message
Stuart Bishop (stub) wrote :

On 26 June 2015 at 22:37, Tim Van Steenburgh <email address hidden> wrote:

> Check out the console logs here: http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-259

These seem to be for an earlier revision of the charm. In particular,
the Test21Deployment tests are expected to fail because that TestCase no
longer exists, yet they are still being run. This broke with my work on
the 23rd, so I've fixed the Makefile and tests.yaml and pushed.

--
Stuart Bishop <email address hidden>

Revision history for this message
Stuart Bishop (stub) wrote :

I have fixed the error codes bug, so failing tests will be reported as failed in Jenkins.

Revision history for this message
Stuart Bishop (stub) wrote :

Another run at http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-263

http://reports.vapour.ws/all-bundle-and-charm-results/charm-bundle-test-parent-263/charm/charm-testing-hp/3 shows test_external_mount failing at 11:07, when amulet fails to read the contents of a remote directory. However, http://data.vapour.ws/charm-test/charm-bundle-test-hp-120-all-machines-log clearly shows the directory being created at 11:04:34 (the migration starts at 11:04:29).

Revision history for this message
Stuart Bishop (stub) wrote :

http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-264 has a complete successful run on AWS. The other clouds are having problems with provisioning nodes.

Revision history for this message
Stuart Bishop (stub) wrote :

I've also opened Bug #1470024 on the sentry.directory_listing failure, which is breaking the external storage test on hpcloud. I added extra debugging that lists the contents of the root directory before attempting the Cassandra-specific directory, demonstrating that this is an Amulet/Juju/cloud failure rather than a charm failure.
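
The extra debugging amounts to something like the following Amulet sketch;
sentry.directory_listing is the call named in the bug, and the data directory
path is illustrative only:

    # unit is an amulet UnitSentry for one of the deployed cassandra units.
    # Listing '/' first shows the failure is in the tooling, not a missing
    # Cassandra directory.
    print(unit.directory_listing('/'))
    print(unit.directory_listing('/srv/cassandra_ext/data'))  # illustrative path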

362. By Tim Van Steenburgh

[stub] Rework the Cassandra charm to rely on leadership, and maintain
the unit and service status while we are at it.

By using leadership, we increase reliability and are able to remove a
large amount of code that was needed to cope with flaky coordination.

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

LGTM, merging on the merit of tests passing on AWS. Other failures are clearly tooling-related, not problems in the charm. Thanks for filing the bug against amulet.

review: Approve

Preview Diff

1=== modified file 'Makefile'
2--- Makefile 2015-04-28 16:20:05 +0000
3+++ Makefile 2015-06-30 07:36:26 +0000
4@@ -34,6 +34,10 @@
5 PIP=.venv3/bin/pip3.4 -q
6 NOSETESTS=.venv3/bin/nosetests-3.4 -sv
7
8+# Set pipefail so we can get sane error codes while tagging test output
9+# with ts(1)
10+SHELL=bash -o pipefail
11+
12 deps: packages venv3
13
14 lint: deps
15@@ -41,16 +45,17 @@
16 free --human
17 charm proof $(CHARM_DIR)
18 flake8 \
19- --ignore=E402 \
20+ --ignore=E402,E265 \
21 --exclude=charmhelpers,.venv2,.venv3 hooks tests testing
22+ @echo OK: Lint free `date`
23
24 unittest: lint
25 $(NOSETESTS) \
26 tests.test_actions --cover-package=actions \
27 tests.test_helpers --cover-package=helpers \
28- tests.test_rollingrestart --cover-package=rollingrestart \
29 tests.test_definitions --cover-package=definitions \
30 --with-coverage --cover-branches
31+ @echo OK: Unit tests pass `date`
32
33 test: unittest
34 AMULET_TIMEOUT=3600 \
35@@ -62,11 +67,11 @@
36 AMULET_TIMEOUT=5400 \
37 $(NOSETESTS) tests.test_integration:Test1UnitDeployment 2>&1 | ts
38
39-21test: unittest Test21Deployment
40-Test21Deployment: deps
41+20test: unittest Test21Deployment
42+Test20Deployment: deps
43 date
44 AMULET_TIMEOUT=5400 \
45- $(NOSETESTS) tests.test_integration:Test21Deployment 2>&1 | ts
46+ $(NOSETESTS) tests.test_integration:Test20Deployment 2>&1 | ts
47
48 3test: unittest Test3UnitDeployment
49 Test3UnitDeployment: deps
50@@ -95,7 +100,6 @@
51 $(NOSETESTS) \
52 tests.test_actions --cover-package=actions \
53 tests.test_helpers --cover-package=helpers \
54- tests.test_rollingrestart --cover-package=rollingrestart \
55 tests.test_definitions --cover-package=definitions \
56 --with-coverage --cover-branches \
57 --cover-html --cover-html-dir=coverage \
58
59=== modified file 'README.md'
60--- README.md 2015-03-07 09:27:02 +0000
61+++ README.md 2015-06-30 07:36:26 +0000
62@@ -14,9 +14,9 @@
63 # Editions
64
65 This charm supports Apache Cassandra 2.0, Apache Cassandra 2.1, and
66-Datastax Enterprise 4.6. The default is Apache Cassandra 2.0.
67+Datastax Enterprise 4.7. The default is Apache Cassandra 2.1.
68
69-To use Apache Cassandra 2.1, specify the Apache Cassandra 2.1 archive source
70+To use Apache Cassandra 2.0, specify the Apache Cassandra 2.0 archive source
71 in the `install_sources` config setting when deploying.
72
73 To use Datastax Enterprise, set the `edition` config setting to `dse`
74@@ -83,23 +83,16 @@
75 `install_keys` configuration item. Place the DataStax packages in a
76 local archive to avoid downloading from datastax.com.
77
78-The Cassandra Python driver and some dependencies are installed using
79-`pip(1)`. Set the `http_proxy` config item to direct its downloads via
80-a proxy server, such as squid or devpi.
81-
82
83 ## Oracle Java SE
84
85-Cassandra recommends using Oracle Java SE 7. Unfortunately, this
86+Cassandra recommends using Oracle Java SE 8. Unfortunately, this
87 software is accessible only after accepting Oracle's click-through
88 license making deployments using it much more cumbersome. You will need
89-to download the Oracle Java SE 7 Server Runtime for Linux, and place the
90+to download the Oracle Java SE 8 Server Runtime for Linux, and place the
91 tarball at a URL accessible to your deployed units. The config item
92 `private_jre_url` needs to be set to this URL.
93
94-The recommended Java version is expected to change, as Java SE 7 public
95-releases cease as of April 2015.
96-
97
98 # Usage
99
100
101=== modified file 'charm-helpers.yaml'
102--- charm-helpers.yaml 2015-04-03 15:42:21 +0000
103+++ charm-helpers.yaml 2015-06-30 07:36:26 +0000
104@@ -18,11 +18,10 @@
105 #branch: lp:charm-helpers
106 branch: lp:~stub/charm-helpers/integration
107 include:
108- - __init__
109+ - coordinator
110 - core
111 - fetch
112 - contrib.charmsupport
113 - contrib.templating.jinja
114- - contrib.peerstorage
115 - contrib.network.ufw
116- - contrib.unison
117+ - contrib.benchmark
118
119=== modified file 'config.yaml'
120--- config.yaml 2015-03-07 09:27:02 +0000
121+++ config.yaml 2015-06-30 07:36:26 +0000
122@@ -35,7 +35,7 @@
123 If you are using Datastax Enterprise, you will need to
124 override one defaults with your own username and password.
125 default: |
126- - deb http://www.apache.org/dist/cassandra/debian 20x main
127+ - deb http://www.apache.org/dist/cassandra/debian 21x main
128 # - deb http://debian.datastax.com/community stable main
129 # DSE requires you to register and add your username/password here.
130 # - deb http://un:pw@debian.datastax.com/enterprise stable main
131@@ -207,15 +207,6 @@
132 default: 256
133 description: Number of tokens per node.
134
135- # Not implemented.
136- # force_seed_nodes:
137- # type: string
138- # default: ""
139- # description: >
140- # A comma separated list of seed node ip addresses. This is
141- # useful if the cluster being created in this juju environment
142- # is part of a larger cluser and the seed nodes are remote.
143-
144 # Topology of the service in the cluster.
145 datacenter:
146 type: string
147@@ -312,11 +303,3 @@
148 performance problems and even exaust the server heap. Adjust
149 the thresholds here if you understand the dangers and want
150 to scan more tombstones anyway.
151-
152- post_bootstrap_delay:
153- type: int
154- default: 120
155- description: >
156- Testing option - do not change. When adding multiple new
157- nodes to a cluster, you are supposed to wait two minutes
158- between nodes. I set this to zero to speed up test runs.
159
160=== modified file 'hooks/actions.py'
161--- hooks/actions.py 2015-04-12 16:20:50 +0000
162+++ hooks/actions.py 2015-06-30 07:36:26 +0000
163@@ -23,20 +23,22 @@
164 import shlex
165 import subprocess
166 from textwrap import dedent
167+import time
168 import urllib.request
169
170 from charmhelpers import fetch
171 from charmhelpers.contrib.charmsupport import nrpe
172 from charmhelpers.contrib.network import ufw
173 from charmhelpers.contrib.templating import jinja
174-from charmhelpers.contrib import unison
175 from charmhelpers.core import hookenv, host
176 from charmhelpers.core.fstab import Fstab
177-from charmhelpers.core.hookenv import ERROR, WARNING
178-
179+from charmhelpers.core.hookenv import DEBUG, ERROR, WARNING
180+
181+import cassandra
182+
183+from coordinator import coordinator
184 import helpers
185 import relations
186-import rollingrestart
187
188
189 # These config keys cannot be changed after service deployment.
190@@ -54,7 +56,6 @@
191 'native_transport_port',
192 'partitioner',
193 'num_tokens',
194- 'force_seed_nodes',
195 'max_heap_size',
196 'heap_newsize',
197 'authorizer',
198@@ -83,8 +84,7 @@
199 'nagios_heapchk_warn_pct',
200 'nagios_heapchk_crit_pct',
201 'nagios_disk_warn_pct',
202- 'nagios_disk_crit_pct',
203- 'post_bootstrap_delay'])
204+ 'nagios_disk_crit_pct'])
205
206
207 def action(func):
208@@ -103,6 +103,17 @@
209 return wrapper
210
211
212+def leader_only(func):
213+ '''Decorated function is only run on the leader.'''
214+ @wraps(func)
215+ def wrapper(*args, **kw):
216+ if hookenv.is_leader():
217+ return func(*args, **kw)
218+ else:
219+ return None
220+ return wrapper
221+
222+
223 @action
224 def set_proxy():
225 config = hookenv.config()
226@@ -113,13 +124,14 @@
227
228 @action
229 def revert_unchangeable_config():
230- if hookenv.hook_name() == 'install':
231- # config.previous() only becomes meaningful after the install
232- # hook has run. During the first run on the unit hook, it
233- # reports everything has having None as the previous value.
234+ config = hookenv.config()
235+
236+ # config.previous() only becomes meaningful after the install
237+ # hook has run. During the first run on the unit hook, it
238+ # reports everything has having None as the previous value.
239+ if config._prev_dict is None:
240 return
241
242- config = hookenv.config()
243 for key in UNCHANGEABLE_KEYS:
244 if config.changed(key):
245 previous = config.previous(key)
246@@ -231,48 +243,56 @@
247 helpers.ensure_package_status(helpers.get_cassandra_packages())
248
249
250-@action
251-def install_oracle_jre():
252- if helpers.get_jre() != 'oracle':
253- return
254-
255+def _fetch_oracle_jre():
256 config = hookenv.config()
257 url = config.get('private_jre_url', None)
258- if not url:
259- hookenv.log('private_jre_url not set. Unable to continue.', ERROR)
260- raise SystemExit(1)
261-
262- if config.get('retrieved_jre', None) != url:
263- filename = os.path.join('lib', url.split('/')[-1])
264+ if url and config.get('retrieved_jre', None) != url:
265+ filename = os.path.join(hookenv.charm_dir(),
266+ 'lib', url.split('/')[-1])
267 if not filename.endswith('-linux-x64.tar.gz'):
268- hookenv.log('Invalid JRE URL {}'.format(url), ERROR)
269- raise SystemExit(1)
270+ helpers.status_set('blocked',
271+ 'Invalid private_jre_url {}'.format(url))
272+ raise SystemExit(0)
273+ helpers.status_set(hookenv.status_get(),
274+ 'Downloading Oracle JRE')
275+ hookenv.log('Oracle JRE URL is {}'.format(url))
276 urllib.request.urlretrieve(url, filename)
277 config['retrieved_jre'] = url
278
279- pattern = 'lib/server-jre-?u*-linux-x64.tar.gz'
280+ pattern = os.path.join(hookenv.charm_dir(),
281+ 'lib', 'server-jre-?u*-linux-x64.tar.gz')
282 tarballs = glob.glob(pattern)
283- if not tarballs:
284- hookenv.log('Oracle JRE tarball not found ({})'.format(pattern),
285- ERROR)
286- # We could fallback to OpenJDK, but the user took the trouble
287- # to specify the Oracle JRE and it is recommended for Cassandra
288- # so lets hard fail instead.
289- raise SystemExit(1)
290+ if not (url or tarballs):
291+ helpers.status_set('blocked',
292+ 'private_jre_url not set and no local tarballs.')
293+ raise SystemExit(0)
294+
295+ elif not tarballs:
296+ helpers.status_set('blocked',
297+ 'Oracle JRE tarball not found ({})'.format(pattern))
298+ raise SystemExit(0)
299
300 # Latest tarball by filename/version num. Lets hope they don't hit
301 # 99 (currently at 76).
302 tarball = sorted(tarballs)[-1]
303-
304+ return tarball
305+
306+
307+def _install_oracle_jre_tarball(tarball):
308 # Same directory as webupd8 to avoid surprising people, but it could
309 # be anything.
310- dest = '/usr/lib/jvm/java-7-oracle'
311+ if 'jre-7u' in str(tarball):
312+ dest = '/usr/lib/jvm/java-7-oracle'
313+ else:
314+ dest = '/usr/lib/jvm/java-8-oracle'
315
316 if not os.path.isdir(dest):
317 host.mkdir(dest)
318
319 jre_exists = os.path.exists(os.path.join(dest, 'bin', 'java'))
320
321+ config = hookenv.config()
322+
323 # Unpack the latest tarball if necessary.
324 if config.get('oracle_jre_tarball', '') == tarball and jre_exists:
325 hookenv.log('Already installed {}'.format(tarball))
326@@ -293,6 +313,15 @@
327
328
329 @action
330+def install_oracle_jre():
331+ if helpers.get_jre() != 'oracle':
332+ return
333+
334+ tarball = _fetch_oracle_jre()
335+ _install_oracle_jre_tarball(tarball)
336+
337+
338+@action
339 def emit_java_version():
340 # Log the version for posterity. Could be useful since Oracle JRE
341 # security updates are not automated.
342@@ -355,82 +384,153 @@
343 host.write_file(rackdc_path, rackdc_properties.encode('UTF-8'))
344
345
346+def needs_reset_auth_keyspace_replication():
347+ '''Guard for reset_auth_keyspace_replication.'''
348+ num_nodes = helpers.num_nodes()
349+ n = min(num_nodes, 3)
350+ datacenter = hookenv.config()['datacenter']
351+ with helpers.connect() as session:
352+ strategy_opts = helpers.get_auth_keyspace_replication(session)
353+ rf = int(strategy_opts.get(datacenter, -1))
354+ hookenv.log('system_auth rf={!r}'.format(strategy_opts))
355+ # If the node count has increased, we will change the rf.
356+ # If the node count is decreasing, we do nothing as the service
357+ # may be being destroyed.
358+ if rf < n:
359+ return True
360+ return False
361+
362+
363+@leader_only
364 @action
365+@coordinator.require('repair', needs_reset_auth_keyspace_replication)
366 def reset_auth_keyspace_replication():
367- # This action only lowers the system_auth keyspace replication
368- # values when a node has been decommissioned. The replication settings
369- # are also updated during rolling restart, which takes care of when new
370- # nodes are added.
371- helpers.reset_auth_keyspace_replication()
372+ # Cassandra requires you to manually set the replication factor of
373+ # the system_auth keyspace, to ensure availability and redundancy.
374+ # We replication factor in this service's DC can be no higher than
375+ # the number of bootstrapped nodes. We also cap this at 3 to ensure
376+ # we don't have too many seeds.
377+ num_nodes = helpers.num_nodes()
378+ n = min(num_nodes, 3)
379+ datacenter = hookenv.config()['datacenter']
380+ with helpers.connect() as session:
381+ strategy_opts = helpers.get_auth_keyspace_replication(session)
382+ rf = int(strategy_opts.get(datacenter, -1))
383+ hookenv.log('system_auth rf={!r}'.format(strategy_opts))
384+ if rf != n:
385+ strategy_opts['class'] = 'NetworkTopologyStrategy'
386+ strategy_opts[datacenter] = n
387+ if 'replication_factor' in strategy_opts:
388+ del strategy_opts['replication_factor']
389+ helpers.set_auth_keyspace_replication(session, strategy_opts)
390+ helpers.repair_auth_keyspace()
391+ helpers.set_active()
392
393
394 @action
395 def store_unit_private_ip():
396+ '''Store the unit's private ip address, so we can tell if it changes.'''
397 hookenv.config()['unit_private_ip'] = hookenv.unit_private_ip()
398
399
400-@action
401-def set_unit_zero_bootstrapped():
402- '''Unit #0 is used as the first node in the cluster.
403-
404- Unit #0 is implicitly flagged as bootstrap, and thus befores the
405- first node in the cluster and providing a seed for other nodes to
406- bootstrap off. We can change this when we have juju leadership,
407- making the leader the first node in the cluster. Until then, don't
408- attempt to create a multiunit service if you have removed Unit #0.
409- '''
410- relname = rollingrestart.get_peer_relation_name()
411- if helpers.unit_number() == 0 and hookenv.hook_name().startswith(relname):
412- helpers.set_bootstrapped(True)
413-
414-
415-@action
416-def maybe_schedule_restart():
417- '''Prepare for and schedule a rolling restart if necessary.'''
418+def needs_restart():
419+ '''Return True if Cassandra is not running or needs to be restarted.'''
420+ if helpers.is_decommissioned():
421+ # Decommissioned nodes are never restarted. They remain up
422+ # telling everyone they are decommissioned.
423+ helpers.status_set('blocked', 'Decommissioned node')
424+ return False
425+
426 if not helpers.is_cassandra_running():
427- # Short circuit if Cassandra is not running to avoid log spam.
428- rollingrestart.request_restart()
429- return
430-
431- if helpers.is_decommissioned():
432- hookenv.log("Decommissioned")
433- return
434-
435- # If any of these config items changed, a restart is required.
436+ if helpers.is_bootstrapped():
437+ helpers.status_set('waiting', 'Waiting for permission to start')
438+ else:
439+ helpers.status_set('waiting',
440+ 'Waiting for permission to bootstrap')
441+ return True
442+
443 config = hookenv.config()
444- restart = False
445- for key in RESTART_REQUIRED_KEYS:
446- if config.changed(key):
447- hookenv.log('{} changed. Restart required.'.format(key))
448- restart = True
449+
450+ # If our IP address has changed, we need to restart.
451+ if config.changed('unit_private_ip'):
452+ hookenv.log('waiting',
453+ 'IP address changed. Waiting for restart permission.')
454+ return True
455
456 # If the directory paths have changed, we need to migrate data
457- # during a restart. Directory config items have already been picked
458- # up in the previous check.
459+ # during a restart.
460 storage = relations.StorageRelation()
461 if storage.needs_remount():
462- hookenv.log('Mountpoint changed. Restart and migration required.')
463- restart = True
464+ helpers.status_set(hookenv.status_get(),
465+ 'New mounts. Waiting for restart permission')
466+ return True
467
468- # If our IP address has changed, we need to restart.
469- if config.changed('unit_private_ip'):
470- hookenv.log('Unit IP address changed. Restart required.')
471- restart = True
472+ # If any of these config items changed, a restart is required.
473+ for key in RESTART_REQUIRED_KEYS:
474+ if config.changed(key):
475+ hookenv.log('{} changed. Restart required.'.format(key))
476+ for key in RESTART_REQUIRED_KEYS:
477+ if config.changed(key):
478+ helpers.status_set(hookenv.status_get(),
479+ 'Config changes. '
480+ 'Waiting for restart permission.')
481+ return True
482
483 # If we have new seeds, we should restart.
484- new_seeds = helpers.seed_ips()
485- config['configured_seeds'] = sorted(new_seeds)
486- if config.changed('configured_seeds'):
487+ new_seeds = helpers.get_seed_ips()
488+ if config.get('configured_seeds') != sorted(new_seeds):
489 old_seeds = set(config.previous('configured_seeds') or [])
490 changed = old_seeds.symmetric_difference(new_seeds)
491 # We don't care about the local node in the changes.
492 changed.discard(hookenv.unit_private_ip())
493 if changed:
494- hookenv.log('New seeds {!r}. Restart required.'.format(new_seeds))
495- restart = True
496-
497- if restart:
498- rollingrestart.request_restart()
499+ helpers.status_set(hookenv.status_get(),
500+ 'Updated seeds {!r}. '
501+ 'Waiting for restart permission.'
502+ ''.format(new_seeds))
503+ return True
504+
505+ hookenv.log('Restart not required')
506+ return False
507+
508+
509+@action
510+@coordinator.require('restart', needs_restart)
511+def maybe_restart():
512+ '''Restart sequence.
513+
514+ If a restart is needed, shutdown Cassandra, perform all pending operations
515+ that cannot be be done while Cassandra is live, and restart it.
516+ '''
517+ helpers.status_set('maintenance', 'Stopping Cassandra')
518+ helpers.stop_cassandra()
519+ helpers.remount_cassandra()
520+ helpers.ensure_database_directories()
521+ if helpers.peer_relid() and not helpers.is_bootstrapped():
522+ helpers.status_set('maintenance', 'Bootstrapping')
523+ else:
524+ helpers.status_set('maintenance', 'Starting Cassandra')
525+ helpers.start_cassandra()
526+
527+
528+@action
529+def post_bootstrap():
530+ '''Maintain state on if the node has bootstrapped into the cluster.
531+
532+ Per documented procedure for adding new units to a cluster, wait 2
533+ minutes if the unit has just bootstrapped to ensure other units
534+ do not attempt bootstrap too soon.
535+ '''
536+ if not helpers.is_bootstrapped():
537+ if coordinator.relid is not None:
538+ helpers.status_set('maintenance', 'Post-bootstrap 2 minute delay')
539+ hookenv.log('Post-bootstrap 2 minute delay')
540+ time.sleep(120) # Must wait 2 minutes between bootstrapping nodes.
541+
542+ # Unconditionally call this to publish the bootstrapped flag to
543+ # the peer relation, as the first unit was bootstrapped before
544+ # the peer relation existed.
545+ helpers.set_bootstrapped()
546
547
548 @action
549@@ -443,66 +543,90 @@
550 helpers.start_cassandra()
551
552
553+@leader_only
554 @action
555-def ensure_unit_superuser():
556- helpers.ensure_unit_superuser()
557+def create_unit_superusers():
558+ # The leader creates and updates accounts for nodes, using the
559+ # encrypted password they provide in relations.PeerRelation. We
560+ # don't end up with unencrypted passwords leaving the unit, and we
561+ # don't need to restart Cassandra in no-auth mode which is slow and
562+ # I worry may cause issues interrupting the bootstrap.
563+ if not coordinator.relid:
564+ return # No peer relation, no requests yet.
565+
566+ created_units = helpers.get_unit_superusers()
567+ uncreated_units = [u for u in hookenv.related_units(coordinator.relid)
568+ if u not in created_units]
569+ for peer in uncreated_units:
570+ rel = hookenv.relation_get(unit=peer, rid=coordinator.relid)
571+ username = rel.get('username')
572+ pwhash = rel.get('pwhash')
573+ if not username:
574+ continue
575+ hookenv.log('Creating {} account for {}'.format(username, peer))
576+ with helpers.connect() as session:
577+ helpers.ensure_user(session, username, pwhash, superuser=True)
578+ created_units.add(peer)
579+ helpers.set_unit_superusers(created_units)
580
581
582 @action
583 def reset_all_io_schedulers():
584- helpers.reset_all_io_schedulers()
585+ dirs = helpers.get_all_database_directories()
586+ dirs = (dirs['data_file_directories'] + [dirs['commitlog_directory']] +
587+ [dirs['saved_caches_directory']])
588+ config = hookenv.config()
589+ for d in dirs:
590+ if os.path.isdir(d): # Directory may not exist yet.
591+ helpers.set_io_scheduler(config['io_scheduler'], d)
592+
593+
594+def _client_credentials(relid):
595+ '''Return the client credentials used by relation relid.'''
596+ relinfo = hookenv.relation_get(unit=hookenv.local_unit(), rid=relid)
597+ username = relinfo.get('username')
598+ password = relinfo.get('password')
599+ if username is None or password is None:
600+ for unit in hookenv.related_units(coordinator.relid):
601+ try:
602+ relinfo = hookenv.relation_get(unit=unit, rid=relid)
603+ username = relinfo.get('username')
604+ password = relinfo.get('password')
605+ if username is not None and password is not None:
606+ return username, password
607+ except subprocess.CalledProcessError:
608+ pass # Assume the remote unit has not joined yet.
609+ return None, None
610+ else:
611+ return username, password
612
613
614 def _publish_database_relation(relid, superuser):
615- # Due to Bug #1409763, this functionality is as action rather than a
616- # provided_data item.
617- #
618 # The Casandra service needs to provide a common set of credentials
619- # to a client unit. Juju does not yet provide a leader so we
620- # need another mechanism for determine which unit will create the
621- # client's account, with the remaining units copying the lead unit's
622- # credentials. For the purposes of this charm, the first unit in
623- # order is considered the leader and creates the user with a random
624- # password. It then tickles the peer relation to ensure the other
625- # units get a hook fired and the opportunity to copy and publish
626- # these credentials. If the lowest numbered unit is removed before
627- # all of the other peers have copied its credentials, then the next
628- # lowest will have either already copied the credentials (and the
629- # remaining peers will use them), or the process starts again and
630- # it will generate new credentials.
631- node_list = list(rollingrestart.get_peers()) + [hookenv.local_unit()]
632- sorted_nodes = sorted(node_list, key=lambda unit: int(unit.split('/')[-1]))
633- first_node = sorted_nodes[0]
634-
635- config = hookenv.config()
636-
637- try:
638- relinfo = hookenv.relation_get(unit=first_node, rid=relid)
639- except subprocess.CalledProcessError:
640- if first_node == hookenv.local_unit():
641- raise
642- # relation-get may fail if the specified unit has not yet joined
643- # the peer relation, or has just departed. Try again later.
644- return
645-
646- username = relinfo.get('username')
647- password = relinfo.get('password')
648- if hookenv.local_unit() == first_node:
649- # Lowest numbered unit, at least for now.
650- if 'username' not in relinfo:
651- # Credentials unset. Generate them.
652- username = 'juju_{}'.format(
653- relid.replace(':', '_').replace('-', '_'))
654+ # to a client unit. The leader creates these, if none of the other
655+ # units are found to have published them already (a previously elected
656+ # leader may have done this). The leader then tickles the other units,
657+ # firing a hook and giving them the opportunity to copy and publish
658+ # these credentials.
659+ username, password = _client_credentials(relid)
660+ if username is None:
661+ if hookenv.is_leader():
662+ # Credentials not set. The leader must generate them. We use
663+ # the service name so that database permissions remain valid
664+ # even after the relation is dropped and recreated, or the
665+ # juju environment rebuild and the database restored from
666+ # backups.
667+ username = 'juju_{}'.format(helpers.get_service_name(relid))
668+ if superuser:
669+ username += '_admin'
670 password = host.pwgen()
671- # Wake the other peers, if any.
672- hookenv.relation_set(rollingrestart.get_peer_relation_id(),
673- ping=rollingrestart.utcnow_str())
674- # Create the account if necessary, and reset the password.
675- # We need to reset the password as another unit may have
676- # rudely changed it thinking they were the lowest numbered
677- # unit. Fix this behavior once juju provides real
678- # leadership.
679- helpers.ensure_user(username, password, superuser)
680+ pwhash = helpers.encrypt_password(password)
681+ with helpers.connect() as session:
682+ helpers.ensure_user(session, username, pwhash, superuser)
683+ # Wake the peers, if any.
684+ helpers.leader_ping()
685+ else:
686+ return # No credentials yet. Nothing to do.
687
688 # Publish the information the client needs on the relation where
689 # they can find it.
690@@ -511,6 +635,7 @@
691 # - cluster_name, so clients can differentiate multiple clusters
692 # - datacenter + rack, so clients know what names they can use
693 # when altering keyspace replication settings.
694+ config = hookenv.config()
695 hookenv.relation_set(relid,
696 username=username, password=password,
697 host=hookenv.unit_public_ip(),
698@@ -546,53 +671,13 @@
699
700
701 @action
702-def emit_describe_cluster():
703- '''Spam 'nodetool describecluster' into the logs.'''
704+def emit_cluster_info():
705 helpers.emit_describe_cluster()
706-
707-
708-@action
709-def emit_auth_keyspace_status():
710- '''Spam 'nodetool status system_auth' into the logs.'''
711 helpers.emit_auth_keyspace_status()
712-
713-
714-@action
715-def emit_netstats():
716- '''Spam 'nodetool netstats' into the logs.'''
717 helpers.emit_netstats()
718
719
720 @action
721-def shutdown_before_joining_peers():
722- '''Shutdown the node before opening firewall ports for our peers.
723-
724- When the peer relation is first joined, the node has already been
725- setup and is running as a standalone cluster. This is problematic
726- when the peer relation has been formed, as if it is left running
727- peers may conect to it and initiate replication before this node
728- has been properly reset and bootstrapped. To avoid this, we
729- shutdown the node before opening firewall ports to any peers. The
730- firewall can then be opened, and peers will not find a node here
731- until it starts its bootstrapping.
732- '''
733- relname = rollingrestart.get_peer_relation_name()
734- if hookenv.hook_name() == '{}-relation-joined'.format(relname):
735- if not helpers.is_bootstrapped():
736- helpers.stop_cassandra(immediate=True)
737-
738-
739-@action
740-def grant_ssh_access():
741- '''Grant SSH access to run nodetool on remote nodes.
742-
743- This is easier than setting up remote JMX access, and more secure.
744- '''
745- unison.ssh_authorized_peers(rollingrestart.get_peer_relation_name(),
746- 'juju_ssh', ensure_local_user=True)
747-
748-
749-@action
750 def configure_firewall():
751 '''Configure firewall rules using ufw.
752
753@@ -632,13 +717,9 @@
754
755 # Rules for peers
756 for relinfo in hookenv.relations_of_type('cluster'):
757- for port in peer_ports:
758- desired_rules.add((relinfo['private-address'], 'any', port))
759-
760- # External seeds also need access.
761- for seed_ip in helpers.seed_ips():
762- for port in peer_ports:
763- desired_rules.add((seed_ip, 'any', port))
764+ if relinfo['private-address']:
765+ for port in peer_ports:
766+ desired_rules.add((relinfo['private-address'], 'any', port))
767
768 previous_rules = set(tuple(rule) for rule in config.get('ufw_rules', []))
769
770@@ -683,8 +764,7 @@
771 description="Check Cassandra Heap",
772 check_cmd="check_cassandra_heap.sh {} {} {}"
773 "".format(hookenv.unit_private_ip(), cassandra_heap_warn,
774- cassandra_heap_crit)
775- )
776+ cassandra_heap_crit))
777
778 cassandra_disk_warn = conf.get('nagios_disk_warn_pct')
779 cassandra_disk_crit = conf.get('nagios_disk_crit_pct')
780@@ -699,7 +779,116 @@
781 description="Check Cassandra Disk {}".format(disk),
782 check_cmd="check_disk -u GB -w {}% -c {}% -K 5% -p {}"
783 "".format(cassandra_disk_warn, cassandra_disk_crit,
784- disk)
785- )
786-
787+ disk))
788 nrpe_compat.write()
789+
790+
791+@leader_only
792+@action
793+def maintain_seeds():
794+ '''The leader needs to maintain the list of seed nodes'''
795+ seed_ips = helpers.get_seed_ips()
796+ hookenv.log('Current seeds == {!r}'.format(seed_ips), DEBUG)
797+
798+ bootstrapped_ips = helpers.get_bootstrapped_ips()
799+ hookenv.log('Bootstrapped == {!r}'.format(bootstrapped_ips), DEBUG)
800+
801+ # Remove any seeds that are no longer bootstrapped, such as dropped
802+ # units.
803+ seed_ips.intersection_update(bootstrapped_ips)
804+
805+ # Add more bootstrapped nodes, if necessary, to get to our maximum
806+ # of 3 seeds.
807+ potential_seed_ips = list(reversed(sorted(bootstrapped_ips)))
808+ while len(seed_ips) < 3 and potential_seed_ips:
809+ seed_ips.add(potential_seed_ips.pop())
810+
811+ # If there are no seeds or bootstrapped nodes, start with the leader. Us.
812+ if len(seed_ips) == 0:
813+ seed_ips.add(hookenv.unit_private_ip())
814+
815+ hookenv.log('Updated seeds == {!r}'.format(seed_ips), DEBUG)
816+
817+ hookenv.leader_set(seeds=','.join(sorted(seed_ips)))
818+
819+
820+@leader_only
821+@action
822+def reset_default_password():
823+ if hookenv.leader_get('default_admin_password_changed'):
824+ hookenv.log('Default admin password already changed')
825+ return
826+
827+ # Cassandra ships with well known credentials, rather than
828+ # providing a tool to reset credentials. This is a huge security
829+ # hole we must close.
830+ try:
831+ # We need a big timeout here, as the cassandra user actually
832+ # springs into existence some time after Cassandra has started
833+ # up and is accepting connections.
834+ with helpers.connect('cassandra', 'cassandra',
835+ timeout=120, auth_timeout=120) as session:
836+ # But before we close this security hole, we need to use these
837+ # credentials to create a different admin account for the
838+ # leader, allowing it to create accounts for other nodes as they
839+ # join. The alternative is restarting Cassandra without
840+ # authentication, which this charm will likely need to do in the
841+ # future when we allow Cassandra services to be related together.
842+ helpers.status_set('maintenance',
843+ 'Creating initial superuser account')
844+ username, password = helpers.superuser_credentials()
845+ pwhash = helpers.encrypt_password(password)
846+ helpers.ensure_user(session, username, pwhash, superuser=True)
847+ helpers.set_unit_superusers([hookenv.local_unit()])
848+
849+ helpers.status_set('maintenance',
850+ 'Changing default admin password')
851+ helpers.query(session, 'ALTER USER cassandra WITH PASSWORD %s',
852+ cassandra.ConsistencyLevel.ALL, (host.pwgen(),))
853+ except cassandra.AuthenticationFailed:
854+ hookenv.log('Default superuser account already reset')
855+ try:
856+ with helpers.connect():
857+ hookenv.log("Leader's superuser account already created")
858+ except cassandra.AuthenticationFailed:
859+ # We have no known superuser credentials. Create the account
860+ # the hard, slow way. This will be the normal method
861+ # of creating the service's initial account when we allow
862+ # services to be related together.
863+ helpers.create_unit_superuser_hard()
864+
865+ hookenv.leader_set(default_admin_password_changed=True)
866+
867+
868+@action
869+def set_active():
870+ # If we got this far, the unit is active. Update the status if it is
871+ # not already active. We don't do this unconditionally, as the charm
872+ # may be active but doing stuff, like active but waiting for restart
873+ # permission.
874+ if hookenv.status_get() != 'active':
875+ helpers.set_active()
876+ else:
877+ hookenv.log('Unit status already active', DEBUG)
878+
879+
880+@action
881+def request_unit_superuser():
882+ relid = helpers.peer_relid()
883+ if relid is None:
884+ hookenv.log('Request deferred until peer relation exists')
885+ return
886+
887+ relinfo = hookenv.relation_get(unit=hookenv.local_unit(),
888+ rid=relid)
889+ if relinfo and relinfo.get('username'):
890+ # We must avoid blindly setting the pwhash on the relation,
891+ # as we will likely get a different value everytime we
892+ # encrypt the password due to the random salt.
893+ hookenv.log('Superuser account request previously made')
894+ else:
895+ # Publish the requested superuser and hash to our peers.
896+ username, password = helpers.superuser_credentials()
897+ pwhash = helpers.encrypt_password(password)
898+ hookenv.relation_set(relid, username=username, pwhash=pwhash)
899+ hookenv.log('Requested superuser account creation')
900
901=== added directory 'hooks/charmhelpers/contrib/benchmark'
902=== added file 'hooks/charmhelpers/contrib/benchmark/__init__.py'
903--- hooks/charmhelpers/contrib/benchmark/__init__.py 1970-01-01 00:00:00 +0000
904+++ hooks/charmhelpers/contrib/benchmark/__init__.py 2015-06-30 07:36:26 +0000
905@@ -0,0 +1,124 @@
906+# Copyright 2014-2015 Canonical Limited.
907+#
908+# This file is part of charm-helpers.
909+#
910+# charm-helpers is free software: you can redistribute it and/or modify
911+# it under the terms of the GNU Lesser General Public License version 3 as
912+# published by the Free Software Foundation.
913+#
914+# charm-helpers is distributed in the hope that it will be useful,
915+# but WITHOUT ANY WARRANTY; without even the implied warranty of
916+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
917+# GNU Lesser General Public License for more details.
918+#
919+# You should have received a copy of the GNU Lesser General Public License
920+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
921+
922+import subprocess
923+import time
924+import os
925+from distutils.spawn import find_executable
926+
927+from charmhelpers.core.hookenv import (
928+ in_relation_hook,
929+ relation_ids,
930+ relation_set,
931+ relation_get,
932+)
933+
934+
935+def action_set(key, val):
936+ if find_executable('action-set'):
937+ action_cmd = ['action-set']
938+
939+ if isinstance(val, dict):
940+ for k, v in iter(val.items()):
941+ action_set('%s.%s' % (key, k), v)
942+ return True
943+
944+ action_cmd.append('%s=%s' % (key, val))
945+ subprocess.check_call(action_cmd)
946+ return True
947+ return False
948+
949+
950+class Benchmark():
951+ """
952+ Helper class for the `benchmark` interface.
953+
954+ :param list actions: Define the actions that are also benchmarks
955+
956+ From inside the benchmark-relation-changed hook, you would
957+ Benchmark(['memory', 'cpu', 'disk', 'smoke', 'custom'])
958+
959+ Examples:
960+
961+ siege = Benchmark(['siege'])
962+ siege.start()
963+ [... run siege ...]
964+ # The higher the score, the better the benchmark
965+ siege.set_composite_score(16.70, 'trans/sec', 'desc')
966+ siege.finish()
967+
968+
969+ """
970+
971+ required_keys = [
972+ 'hostname',
973+ 'port',
974+ 'graphite_port',
975+ 'graphite_endpoint',
976+ 'api_port'
977+ ]
978+
979+ def __init__(self, benchmarks=None):
980+ if in_relation_hook():
981+ if benchmarks is not None:
982+ for rid in sorted(relation_ids('benchmark')):
983+ relation_set(relation_id=rid, relation_settings={
984+ 'benchmarks': ",".join(benchmarks)
985+ })
986+
987+ # Check the relation data
988+ config = {}
989+ for key in self.required_keys:
990+ val = relation_get(key)
991+ if val is not None:
992+ config[key] = val
993+ else:
994+ # We don't have all of the required keys
995+ config = {}
996+ break
997+
998+ if len(config):
999+ with open('/etc/benchmark.conf', 'w') as f:
1000+ for key, val in iter(config.items()):
1001+ f.write("%s=%s\n" % (key, val))
1002+
1003+ @staticmethod
1004+ def start():
1005+ action_set('meta.start', time.strftime('%Y-%m-%dT%H:%M:%SZ'))
1006+
1007+ """
1008+ If the collectd charm is also installed, tell it to send a snapshot
1009+ of the current profile data.
1010+ """
1011+ COLLECT_PROFILE_DATA = '/usr/local/bin/collect-profile-data'
1012+ if os.path.exists(COLLECT_PROFILE_DATA):
1013+ subprocess.check_output([COLLECT_PROFILE_DATA])
1014+
1015+ @staticmethod
1016+ def finish():
1017+ action_set('meta.stop', time.strftime('%Y-%m-%dT%H:%M:%SZ'))
1018+
1019+ @staticmethod
1020+ def set_composite_score(value, units, direction='asc'):
1021+ """
1022+ Set the composite score for a benchmark run. This is a single number
1023+ representative of the benchmark results. This could be the most
1024+ important metric, or an amalgamation of metric scores.
1025+ """
1026+ return action_set(
1027+ "meta.composite",
1028+ {'value': value, 'units': units, 'direction': direction}
1029+ )
1030
1031=== modified file 'hooks/charmhelpers/contrib/charmsupport/nrpe.py'
1032--- hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-04-03 15:42:21 +0000
1033+++ hooks/charmhelpers/contrib/charmsupport/nrpe.py 2015-06-30 07:36:26 +0000
1034@@ -247,7 +247,9 @@
1035
1036 service('restart', 'nagios-nrpe-server')
1037
1038- for rid in relation_ids("local-monitors"):
1039+ monitor_ids = relation_ids("local-monitors") + \
1040+ relation_ids("nrpe-external-master")
1041+ for rid in monitor_ids:
1042 relation_set(relation_id=rid, monitors=yaml.dump(monitors))
1043
1044
1045
1046=== removed directory 'hooks/charmhelpers/contrib/peerstorage'
1047=== removed file 'hooks/charmhelpers/contrib/peerstorage/__init__.py'
1048--- hooks/charmhelpers/contrib/peerstorage/__init__.py 2015-01-26 13:07:31 +0000
1049+++ hooks/charmhelpers/contrib/peerstorage/__init__.py 1970-01-01 00:00:00 +0000
1050@@ -1,150 +0,0 @@
1051-# Copyright 2014-2015 Canonical Limited.
1052-#
1053-# This file is part of charm-helpers.
1054-#
1055-# charm-helpers is free software: you can redistribute it and/or modify
1056-# it under the terms of the GNU Lesser General Public License version 3 as
1057-# published by the Free Software Foundation.
1058-#
1059-# charm-helpers is distributed in the hope that it will be useful,
1060-# but WITHOUT ANY WARRANTY; without even the implied warranty of
1061-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1062-# GNU Lesser General Public License for more details.
1063-#
1064-# You should have received a copy of the GNU Lesser General Public License
1065-# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
1066-
1067-import six
1068-from charmhelpers.core.hookenv import relation_id as current_relation_id
1069-from charmhelpers.core.hookenv import (
1070- is_relation_made,
1071- relation_ids,
1072- relation_get,
1073- local_unit,
1074- relation_set,
1075-)
1076-
1077-
1078-"""
1079-This helper provides functions to support use of a peer relation
1080-for basic key/value storage, with the added benefit that all storage
1081-can be replicated across peer units.
1082-
1083-Requirement to use:
1084-
1085-To use this, the "peer_echo()" method has to be called form the peer
1086-relation's relation-changed hook:
1087-
1088-@hooks.hook("cluster-relation-changed") # Adapt the to your peer relation name
1089-def cluster_relation_changed():
1090- peer_echo()
1091-
1092-Once this is done, you can use peer storage from anywhere:
1093-
1094-@hooks.hook("some-hook")
1095-def some_hook():
1096- # You can store and retrieve key/values this way:
1097- if is_relation_made("cluster"): # from charmhelpers.core.hookenv
1098- # There are peers available so we can work with peer storage
1099- peer_store("mykey", "myvalue")
1100- value = peer_retrieve("mykey")
1101- print value
1102- else:
1103- print "No peers joind the relation, cannot share key/values :("
1104-"""
1105-
1106-
1107-def peer_retrieve(key, relation_name='cluster'):
1108- """Retrieve a named key from peer relation `relation_name`."""
1109- cluster_rels = relation_ids(relation_name)
1110- if len(cluster_rels) > 0:
1111- cluster_rid = cluster_rels[0]
1112- return relation_get(attribute=key, rid=cluster_rid,
1113- unit=local_unit())
1114- else:
1115- raise ValueError('Unable to detect'
1116- 'peer relation {}'.format(relation_name))
1117-
1118-
1119-def peer_retrieve_by_prefix(prefix, relation_name='cluster', delimiter='_',
1120- inc_list=None, exc_list=None):
1121- """ Retrieve k/v pairs given a prefix and filter using {inc,exc}_list """
1122- inc_list = inc_list if inc_list else []
1123- exc_list = exc_list if exc_list else []
1124- peerdb_settings = peer_retrieve('-', relation_name=relation_name)
1125- matched = {}
1126- if peerdb_settings is None:
1127- return matched
1128- for k, v in peerdb_settings.items():
1129- full_prefix = prefix + delimiter
1130- if k.startswith(full_prefix):
1131- new_key = k.replace(full_prefix, '')
1132- if new_key in exc_list:
1133- continue
1134- if new_key in inc_list or len(inc_list) == 0:
1135- matched[new_key] = v
1136- return matched
1137-
1138-
1139-def peer_store(key, value, relation_name='cluster'):
1140- """Store the key/value pair on the named peer relation `relation_name`."""
1141- cluster_rels = relation_ids(relation_name)
1142- if len(cluster_rels) > 0:
1143- cluster_rid = cluster_rels[0]
1144- relation_set(relation_id=cluster_rid,
1145- relation_settings={key: value})
1146- else:
1147- raise ValueError('Unable to detect '
1148- 'peer relation {}'.format(relation_name))
1149-
1150-
1151-def peer_echo(includes=None):
1152- """Echo filtered attributes back onto the same relation for storage.
1153-
1154- This is a requirement to use the peerstorage module - it needs to be called
1155- from the peer relation's changed hook.
1156- """
1157- rdata = relation_get()
1158- echo_data = {}
1159- if includes is None:
1160- echo_data = rdata.copy()
1161- for ex in ['private-address', 'public-address']:
1162- if ex in echo_data:
1163- echo_data.pop(ex)
1164- else:
1165- for attribute, value in six.iteritems(rdata):
1166- for include in includes:
1167- if include in attribute:
1168- echo_data[attribute] = value
1169- if len(echo_data) > 0:
1170- relation_set(relation_settings=echo_data)
1171-
1172-
1173-def peer_store_and_set(relation_id=None, peer_relation_name='cluster',
1174- peer_store_fatal=False, relation_settings=None,
1175- delimiter='_', **kwargs):
1176- """Store passed-in arguments both in argument relation and in peer storage.
1177-
1178- It functions like doing relation_set() and peer_store() at the same time,
1179- with the same data.
1180-
1181- @param relation_id: the id of the relation to store the data on. Defaults
1182- to the current relation.
1183- @param peer_store_fatal: Set to True, the function will raise an exception
1184- should the peer sotrage not be avialable."""
1185-
1186- relation_settings = relation_settings if relation_settings else {}
1187- relation_set(relation_id=relation_id,
1188- relation_settings=relation_settings,
1189- **kwargs)
1190- if is_relation_made(peer_relation_name):
1191- for key, value in six.iteritems(dict(list(kwargs.items()) +
1192- list(relation_settings.items()))):
1193- key_prefix = relation_id or current_relation_id()
1194- peer_store(key_prefix + delimiter + key,
1195- value,
1196- relation_name=peer_relation_name)
1197- else:
1198- if peer_store_fatal:
1199- raise ValueError('Unable to detect '
1200- 'peer relation {}'.format(peer_relation_name))
1201
1202=== removed directory 'hooks/charmhelpers/contrib/unison'
1203=== removed file 'hooks/charmhelpers/contrib/unison/__init__.py'
1204--- hooks/charmhelpers/contrib/unison/__init__.py 2015-04-03 15:42:21 +0000
1205+++ hooks/charmhelpers/contrib/unison/__init__.py 1970-01-01 00:00:00 +0000
1206@@ -1,298 +0,0 @@
1207-# Copyright 2014-2015 Canonical Limited.
1208-#
1209-# This file is part of charm-helpers.
1210-#
1211-# charm-helpers is free software: you can redistribute it and/or modify
1212-# it under the terms of the GNU Lesser General Public License version 3 as
1213-# published by the Free Software Foundation.
1214-#
1215-# charm-helpers is distributed in the hope that it will be useful,
1216-# but WITHOUT ANY WARRANTY; without even the implied warranty of
1217-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1218-# GNU Lesser General Public License for more details.
1219-#
1220-# You should have received a copy of the GNU Lesser General Public License
1221-# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
1222-
1223-# Easy file synchronization among peer units using ssh + unison.
1224-#
1225-# From *both* peer relation -joined and -changed, add a call to
1226-# ssh_authorized_peers() describing the peer relation and the desired
1227-# user + group. After all peer relations have settled, all hosts should
1228-# be able to connect to on another via key auth'd ssh as the specified user.
1229-#
1230-# Other hooks are then free to synchronize files and directories using
1231-# sync_to_peers().
1232-#
1233-# For a peer relation named 'cluster', for example:
1234-#
1235-# cluster-relation-joined:
1236-# ...
1237-# ssh_authorized_peers(peer_interface='cluster',
1238-# user='juju_ssh', group='juju_ssh',
1239-# ensure_user=True)
1240-# ...
1241-#
1242-# cluster-relation-changed:
1243-# ...
1244-# ssh_authorized_peers(peer_interface='cluster',
1245-# user='juju_ssh', group='juju_ssh',
1246-# ensure_user=True)
1247-# ...
1248-#
1249-# Hooks are now free to sync files as easily as:
1250-#
1251-# files = ['/etc/fstab', '/etc/apt.conf.d/']
1252-# sync_to_peers(peer_interface='cluster',
1253-# user='juju_ssh, paths=[files])
1254-#
1255-# It is assumed the charm itself has setup permissions on each unit
1256-# such that 'juju_ssh' has read + write permissions. Also assumed
1257-# that the calling charm takes care of leader delegation.
1258-#
1259-# Additionally files can be synchronized only to an specific unit:
1260-# sync_to_peer(slave_address, user='juju_ssh',
1261-# paths=[files], verbose=False)
1262-
1263-import os
1264-import pwd
1265-
1266-from copy import copy
1267-from subprocess import check_call, check_output
1268-
1269-from charmhelpers.core.host import (
1270- adduser,
1271- add_user_to_group,
1272- pwgen,
1273-)
1274-
1275-from charmhelpers.core.hookenv import (
1276- log,
1277- hook_name,
1278- relation_ids,
1279- related_units,
1280- relation_set,
1281- relation_get,
1282- unit_private_ip,
1283- INFO,
1284- ERROR,
1285-)
1286-
1287-BASE_CMD = ['unison', '-auto', '-batch=true', '-confirmbigdel=false',
1288- '-fastcheck=true', '-group=false', '-owner=false',
1289- '-prefer=newer', '-times=true']
1290-
1291-
1292-def get_homedir(user):
1293- try:
1294- user = pwd.getpwnam(user)
1295- return user.pw_dir
1296- except KeyError:
1297- log('Could not get homedir for user %s: user exists?' % (user), ERROR)
1298- raise Exception
1299-
1300-
1301-def create_private_key(user, priv_key_path):
1302- if not os.path.isfile(priv_key_path):
1303- log('Generating new SSH key for user %s.' % user)
1304- cmd = ['ssh-keygen', '-q', '-N', '', '-t', 'rsa', '-b', '2048',
1305- '-f', priv_key_path]
1306- check_call(cmd)
1307- else:
1308- log('SSH key already exists at %s.' % priv_key_path)
1309- check_call(['chown', user, priv_key_path])
1310- check_call(['chmod', '0600', priv_key_path])
1311-
1312-
1313-def create_public_key(user, priv_key_path, pub_key_path):
1314- if not os.path.isfile(pub_key_path):
1315- log('Generating missing ssh public key @ %s.' % pub_key_path)
1316- cmd = ['ssh-keygen', '-y', '-f', priv_key_path]
1317- p = check_output(cmd).strip()
1318- with open(pub_key_path, 'wb') as out:
1319- out.write(p)
1320- check_call(['chown', user, pub_key_path])
1321-
1322-
1323-def get_keypair(user):
1324- home_dir = get_homedir(user)
1325- ssh_dir = os.path.join(home_dir, '.ssh')
1326- priv_key = os.path.join(ssh_dir, 'id_rsa')
1327- pub_key = '%s.pub' % priv_key
1328-
1329- if not os.path.isdir(ssh_dir):
1330- os.mkdir(ssh_dir)
1331- check_call(['chown', '-R', user, ssh_dir])
1332-
1333- create_private_key(user, priv_key)
1334- create_public_key(user, priv_key, pub_key)
1335-
1336- with open(priv_key, 'r') as p:
1337- _priv = p.read().strip()
1338-
1339- with open(pub_key, 'r') as p:
1340- _pub = p.read().strip()
1341-
1342- return (_priv, _pub)
1343-
1344-
1345-def write_authorized_keys(user, keys):
1346- home_dir = get_homedir(user)
1347- ssh_dir = os.path.join(home_dir, '.ssh')
1348- auth_keys = os.path.join(ssh_dir, 'authorized_keys')
1349- log('Syncing authorized_keys @ %s.' % auth_keys)
1350- with open(auth_keys, 'w') as out:
1351- for k in keys:
1352- out.write('%s\n' % k)
1353-
1354-
1355-def write_known_hosts(user, hosts):
1356- home_dir = get_homedir(user)
1357- ssh_dir = os.path.join(home_dir, '.ssh')
1358- known_hosts = os.path.join(ssh_dir, 'known_hosts')
1359- khosts = []
1360- for host in hosts:
1361- cmd = ['ssh-keyscan', '-H', '-t', 'rsa', host]
1362- remote_key = check_output(cmd, universal_newlines=True).strip()
1363- khosts.append(remote_key)
1364- log('Syncing known_hosts @ %s.' % known_hosts)
1365- with open(known_hosts, 'w') as out:
1366- for host in khosts:
1367- out.write('%s\n' % host)
1368-
1369-
1370-def ensure_user(user, group=None):
1371- adduser(user, pwgen())
1372- if group:
1373- add_user_to_group(user, group)
1374-
1375-
1376-def ssh_authorized_peers(peer_interface, user, group=None,
1377- ensure_local_user=False):
1378- """
1379- Main setup function, should be called from both peer -changed and -joined
1380- hooks with the same parameters.
1381- """
1382- if ensure_local_user:
1383- ensure_user(user, group)
1384- priv_key, pub_key = get_keypair(user)
1385- hook = hook_name()
1386- if hook == '%s-relation-joined' % peer_interface:
1387- relation_set(ssh_pub_key=pub_key)
1388- elif hook == '%s-relation-changed' % peer_interface:
1389- hosts = []
1390- keys = []
1391-
1392- for r_id in relation_ids(peer_interface):
1393- for unit in related_units(r_id):
1394- ssh_pub_key = relation_get('ssh_pub_key',
1395- rid=r_id,
1396- unit=unit)
1397- priv_addr = relation_get('private-address',
1398- rid=r_id,
1399- unit=unit)
1400- if ssh_pub_key:
1401- keys.append(ssh_pub_key)
1402- hosts.append(priv_addr)
1403- else:
1404- log('ssh_authorized_peers(): ssh_pub_key '
1405- 'missing for unit %s, skipping.' % unit)
1406- write_authorized_keys(user, keys)
1407- write_known_hosts(user, hosts)
1408- authed_hosts = ':'.join(hosts)
1409- relation_set(ssh_authorized_hosts=authed_hosts)
1410-
1411-
1412-def _run_as_user(user, gid=None):
1413- try:
1414- user = pwd.getpwnam(user)
1415- except KeyError:
1416- log('Invalid user: %s' % user)
1417- raise Exception
1418- uid = user.pw_uid
1419- gid = gid or user.pw_gid
1420- os.environ['HOME'] = user.pw_dir
1421-
1422- def _inner():
1423- os.setgid(gid)
1424- os.setuid(uid)
1425- return _inner
1426-
1427-
1428-def run_as_user(user, cmd, gid=None):
1429- return check_output(cmd, preexec_fn=_run_as_user(user, gid), cwd='/')
1430-
1431-
1432-def collect_authed_hosts(peer_interface):
1433- '''Iterate through the units on peer interface to find all that
1434- have the calling host in its authorized hosts list'''
1435- hosts = []
1436- for r_id in (relation_ids(peer_interface) or []):
1437- for unit in related_units(r_id):
1438- private_addr = relation_get('private-address',
1439- rid=r_id, unit=unit)
1440- authed_hosts = relation_get('ssh_authorized_hosts',
1441- rid=r_id, unit=unit)
1442-
1443- if not authed_hosts:
1444- log('Peer %s has not authorized *any* hosts yet, skipping.' %
1445- (unit), level=INFO)
1446- continue
1447-
1448- if unit_private_ip() in authed_hosts.split(':'):
1449- hosts.append(private_addr)
1450- else:
1451- log('Peer %s has not authorized *this* host yet, skipping.' %
1452- (unit), level=INFO)
1453- return hosts
1454-
1455-
1456-def sync_path_to_host(path, host, user, verbose=False, cmd=None, gid=None,
1457- fatal=False):
1458- """Sync path to an specific peer host
1459-
1460- Propagates exception if operation fails and fatal=True.
1461- """
1462- cmd = cmd or copy(BASE_CMD)
1463- if not verbose:
1464- cmd.append('-silent')
1465-
1466- # removing trailing slash from directory paths, unison
1467- # doesn't like these.
1468- if path.endswith('/'):
1469- path = path[:(len(path) - 1)]
1470-
1471- cmd = cmd + [path, 'ssh://%s@%s/%s' % (user, host, path)]
1472-
1473- try:
1474- log('Syncing local path %s to %s@%s:%s' % (path, user, host, path))
1475- run_as_user(user, cmd, gid)
1476- except:
1477- log('Error syncing remote files')
1478- if fatal:
1479- raise
1480-
1481-
1482-def sync_to_peer(host, user, paths=None, verbose=False, cmd=None, gid=None,
1483- fatal=False):
1484- """Sync paths to an specific peer host
1485-
1486- Propagates exception if any operation fails and fatal=True.
1487- """
1488- if paths:
1489- for p in paths:
1490- sync_path_to_host(p, host, user, verbose, cmd, gid, fatal)
1491-
1492-
1493-def sync_to_peers(peer_interface, user, paths=None, verbose=False, cmd=None,
1494- gid=None, fatal=False):
1495- """Sync all hosts to an specific path
1496-
1497- The type of group is integer, it allows user has permissions to
1498- operate a directory have a different group id with the user id.
1499-
1500- Propagates exception if any operation fails and fatal=True.
1501- """
1502- if paths:
1503- for host in collect_authed_hosts(peer_interface):
1504- sync_to_peer(host, user, paths, verbose, cmd, gid, fatal)
1505
1506=== added file 'hooks/charmhelpers/coordinator.py'
1507--- hooks/charmhelpers/coordinator.py 1970-01-01 00:00:00 +0000
1508+++ hooks/charmhelpers/coordinator.py 2015-06-30 07:36:26 +0000
1509@@ -0,0 +1,607 @@
1510+# Copyright 2014-2015 Canonical Limited.
1511+#
1512+# This file is part of charm-helpers.
1513+#
1514+# charm-helpers is free software: you can redistribute it and/or modify
1515+# it under the terms of the GNU Lesser General Public License version 3 as
1516+# published by the Free Software Foundation.
1517+#
1518+# charm-helpers is distributed in the hope that it will be useful,
1519+# but WITHOUT ANY WARRANTY; without even the implied warranty of
1520+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1521+# GNU Lesser General Public License for more details.
1522+#
1523+# You should have received a copy of the GNU Lesser General Public License
1524+# along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
1525+'''
1526+The coordinator module allows you to use Juju's leadership feature to
1527+coordinate operations between units of a service.
1528+
1529+Behavior is defined in subclasses of coordinator.BaseCoordinator.
1530+One implementation is provided (coordinator.Serial), which allows an
1531+operation to be run on a single unit at a time, on a first come, first
1532+served basis. You can trivially define more complex behavior by
1533+subclassing BaseCoordinator or Serial.
1534+
1535+:author: Stuart Bishop <stuart.bishop@canonical.com>
1536+
1537+
1538+Services Framework Usage
1539+========================
1540+
1541+Ensure a peer relation is defined in metadata.yaml. Instantiate a
1542+BaseCoordinator subclass before invoking ServiceManager.manage().
1543+Ensure that ServiceManager.manage() is wired up to the leader-elected,
1544+leader-settings-changed, peer relation-changed and peer
1545+relation-departed hooks in addition to any other hooks you need, or your
1546+service will deadlock.
1547+
1548+Ensure calls to acquire() are guarded, so that locks are only requested
1549+when they are really needed (and thus hooks only triggered when necessary).
1550+Failing to do this and calling acquire() unconditionally will put your unit
1551+into a hook loop. Calls to granted() do not need to be guarded.
1552+
1553+For example::
1554+
1555+ from charmhelpers.core import hookenv, services
1556+ from charmhelpers import coordinator
1557+
1558+ def maybe_restart(servicename):
1559+ serial = coordinator.Serial()
1560+ if needs_restart():
1561+ serial.acquire('restart')
1562+ if serial.granted('restart'):
1563+ hookenv.service_restart(servicename)
1564+
1565+ services = [dict(service='servicename',
1566+ data_ready=[maybe_restart])]
1567+
1568+ if __name__ == '__main__':
1569+ _ = coordinator.Serial() # Must instantiate before manager.manage()
1570+ manager = services.ServiceManager(services)
1571+ manager.manage()
1572+
1573+
1574+You can implement a similar pattern using a decorator. If the lock has
1575+not been granted, an attempt to acquire() it will be made if the guard
1576+function returns True. If the lock has been granted, the decorated function
1577+is run as normal::
1578+
1579+ from charmhelpers.core import hookenv, services
1580+ from charmhelpers import coordinator
1581+
1582+    serial = coordinator.Serial()  # Global, instantiated on module import.
1583+
1584+ def needs_restart():
1585+ [ ... Introspect state. Return True if restart is needed ... ]
1586+
1587+ @serial.require('restart', needs_restart)
1588+ def maybe_restart(servicename):
1589+ hookenv.service_restart(servicename)
1590+
1591+ services = [dict(service='servicename',
1592+ data_ready=[maybe_restart])]
1593+
1594+ if __name__ == '__main__':
1595+ manager = services.ServiceManager(services)
1596+ manager.manage()
1597+
1598+
1599+Traditional Usage
1600+=================
1601+
1602+Ensure a peer relation is defined in metadata.yaml.
1603+
1604+If you are using charmhelpers.core.hookenv.Hooks, ensure that a
1605+BaseCoordinator subclass is instantiated before calling Hooks.execute.
1606+
1607+If you are not using charmhelpers.core.hookenv.Hooks, ensure
1608+that a BaseCoordinator subclass is instantiated and its handle()
1609+method called at the start of all your hooks.
1610+
1611+For example::
1612+
1613+ import sys
1614+ from charmhelpers.core import hookenv
1615+ from charmhelpers import coordinator
1616+
1617+ hooks = hookenv.Hooks()
1618+
1619+ def maybe_restart():
1620+ serial = coordinator.Serial()
1621+ if serial.granted('restart'):
1622+ hookenv.service_restart('myservice')
1623+
1624+ @hooks.hook
1625+ def config_changed():
1626+ update_config()
1627+ serial = coordinator.Serial()
1628+ if needs_restart():
1629+ serial.acquire('restart'):
1630+            serial.acquire('restart')
1631+
1632+ # Cluster hooks must be wired up.
1633+ @hooks.hook('cluster-relation-changed', 'cluster-relation-departed')
1634+ def cluster_relation_changed():
1635+ maybe_restart()
1636+
1637+ # Leader hooks must be wired up.
1638+ @hooks.hook('leader-elected', 'leader-settings-changed')
1639+ def leader_settings_changed():
1640+ maybe_restart()
1641+
1642+ [ ... repeat for *all* other hooks you are using ... ]
1643+
1644+ if __name__ == '__main__':
1645+ _ = coordinator.Serial() # Must instantiate before execute()
1646+ hooks.execute(sys.argv)
1647+
1648+
1649+You can also use the require decorator. If the lock has not been granted,
1650+an attempt to acquire() it will be made if the guard function returns True.
1651+If the lock has been granted, the decorated function is run as normal::
1652+
1653+ from charmhelpers.core import hookenv
1654+
1655+ hooks = hookenv.Hooks()
1656+ serial = coordinator.Serial() # Must instantiate before execute()
1657+
1658+    @serial.require('restart', needs_restart)
1659+ def maybe_restart():
1660+ hookenv.service_restart('myservice')
1661+
1662+ @hooks.hook('install', 'config-changed', 'upgrade-charm',
1663+ # Peer and leader hooks must be wired up.
1664+ 'cluster-relation-changed', 'cluster-relation-departed',
1665+ 'leader-elected', 'leader-settings-changed')
1666+ def default_hook():
1667+ [...]
1668+ maybe_restart()
1669+
1670+ if __name__ == '__main__':
1671+ hooks.execute()
1672+
1673+
1674+Details
1675+=======
1676+
1677+A simple API is provided similar to traditional locking APIs. A lock
1678+may be requested using the acquire() method, and the granted() method
1679+may be used to check if a lock previously requested by acquire() has
1680+been granted. It doesn't matter how many times acquire() is called in a
1681+hook.
1682+
1683+Locks are released at the end of the hook they are acquired in. This may
1684+be the current hook if the unit is leader and the lock is free. It is
1685+more likely a future hook (probably leader-settings-changed, possibly
1686+the peer relation-changed or departed hook, potentially any hook).
1687+
1688+Whenever a charm needs to perform a coordinated action it will acquire()
1689+the lock and perform the action immediately if acquisition is
1690+successful. It will also need to perform the same action in every other
1691+hook if the lock has been granted.
1692+
1693+
1694+Grubby Details
1695+--------------
1696+
1697+Why do you need to be able to perform the same action in every hook?
1698+If the unit is the leader, then it may be able to grant its own lock
1699+and perform the action immediately in the source hook. If the unit is
1700+the leader and cannot immediately grant the lock, then its only
1701+guaranteed chance of acquiring the lock is in the peer relation-joined,
1702+relation-changed or peer relation-departed hooks when another unit has
1703+released it (the only channel to communicate to the leader is the peer
1704+relation). If the unit is not the leader, then it is unlikely the lock
1705+is granted in the source hook (a previous hook must have also made the
1706+request for this to happen). A non-leader is notified about the lock via
1707+leader settings. These changes may be visible in any hook, even before
1708+the leader-settings-changed hook has been invoked. Or the requesting
1709+unit may be promoted to leader after making a request, in which case the
1710+lock may be granted in leader-elected or in a future peer
1711+relation-changed or relation-departed hook.
1712+
1713+This could be simpler if leader-settings-changed was invoked on the
1714+leader. We could then never grant locks except in
1715+leader-settings-changed hooks, giving one place for the operation to be
1716+performed. Unfortunately this is not the case with Juju 1.23 leadership.
1717+
1718+But of course, this doesn't really matter to most people as most people
1719+seem to prefer the Services Framework or similar reset-the-world
1720+approaches, rather than the twisty maze of attempting to deduce what
1721+should be done based on what hook happens to be running (which always
1722+seems to evolve into reset-the-world anyway when the charm grows beyond
1723+the trivial).
1724+
1725+I chose not to implement a callback model, where a callback was passed
1726+to acquire to be executed when the lock is granted, because the callback
1727+may become invalid between making the request and the lock being granted
1728+due to an upgrade-charm being run in the interim. And it would create
1729+restrictions, such as no lambdas, callbacks defined at the top level of a
1730+module, etc. Still, we could implement it on top of what is here, e.g.
1731+by adding a defer decorator that stores a pickle of itself to disk and
1732+having BaseCoordinator unpickle and execute them when the locks are granted.
1733+'''
1734+from datetime import datetime
1735+from functools import wraps
1736+import json
1737+import os.path
1738+
1739+from six import with_metaclass
1740+
1741+from charmhelpers.core import hookenv
1742+
1743+
1744+# We make BaseCoordinator and subclasses singletons, so that if we
1745+# need to spill to local storage then only a single instance does so,
1746+# rather than having multiple instances stomp over each other.
1747+class Singleton(type):
1748+ _instances = {}
1749+
1750+ def __call__(cls, *args, **kwargs):
1751+ if cls not in cls._instances:
1752+ cls._instances[cls] = super(Singleton, cls).__call__(*args,
1753+ **kwargs)
1754+ return cls._instances[cls]
1755+
1756+
1757+class BaseCoordinator(with_metaclass(Singleton, object)):
1758+ relid = None # Peer relation-id, set by __init__
1759+ relname = None
1760+
1761+ grants = None # self.grants[unit][lock] == timestamp
1762+ requests = None # self.requests[unit][lock] == timestamp
1763+
1764+ def __init__(self, relation_key='coordinator', peer_relation_name=None):
1765+        '''Instantiate a Coordinator.
1766+
1767+ Data is stored on the peer relation and in leadership storage
1768+ under the provided relation_key.
1769+
1770+ The peer relation is identified by peer_relation_name, and defaults
1771+ to the first one found in metadata.yaml.
1772+ '''
1773+ # Most initialization is deferred, since invoking hook tools from
1774+ # the constructor makes testing hard.
1775+ self.key = relation_key
1776+ self.relname = peer_relation_name
1777+ hookenv.atstart(self.initialize)
1778+
1779+ # Ensure that handle() is called, without placing that burden on
1780+ # the charm author. They still need to do this manually if they
1781+ # are not using a hook framework.
1782+ hookenv.atstart(self.handle)
1783+
1784+ def initialize(self):
1785+ if self.requests is not None:
1786+ return # Already initialized.
1787+
1788+ assert hookenv.has_juju_version('1.23'), 'Needs Juju 1.23+'
1789+
1790+ if self.relname is None:
1791+ self.relname = _implicit_peer_relation_name()
1792+
1793+ relids = hookenv.relation_ids(self.relname)
1794+ if relids:
1795+ self.relid = sorted(relids)[0]
1796+
1797+ # Load our state, from leadership, the peer relationship, and maybe
1798+ # local state as a fallback. Populates self.requests and self.grants.
1799+ self._load_state()
1800+ self._emit_state()
1801+
1802+ # Save our state if the hook completes successfully.
1803+ hookenv.atexit(self._save_state)
1804+
1805+ # Schedule release of granted locks for the end of the hook.
1806+ # This needs to be the last of our atexit callbacks to ensure
1807+ # it will be run first when the hook is complete, because there
1808+ # is no point mutating our state after it has been saved.
1809+ hookenv.atexit(self._release_granted)
1810+
1811+ def acquire(self, lock):
1812+ '''Acquire the named lock, non-blocking.
1813+
1814+ The lock may be granted immediately, or in a future hook.
1815+
1816+ Returns True if the lock has been granted. The lock will be
1817+ automatically released at the end of the hook in which it is
1818+ granted.
1819+
1820+ Do not mindlessly call this method, as it triggers a cascade of
1821+ hooks. For example, if you call acquire() every time in your
1822+ peer relation-changed hook you will end up with an infinite loop
1823+ of hooks. It should almost always be guarded by some condition.
1824+ '''
1825+ unit = hookenv.local_unit()
1826+ ts = self.requests[unit].get(lock)
1827+ if not ts:
1828+ # If there is no outstanding request on the peer relation,
1829+ # create one.
1830+ self.requests.setdefault(lock, {})
1831+ self.requests[unit][lock] = _timestamp()
1832+ self.msg('Requested {}'.format(lock))
1833+
1834+ # If the leader has granted the lock, yay.
1835+ if self.granted(lock):
1836+ self.msg('Acquired {}'.format(lock))
1837+ return True
1838+
1839+ # If the unit making the request also happens to be the
1840+ # leader, it must handle the request now. Even though the
1841+ # request has been stored on the peer relation, the peer
1842+ # relation-changed hook will not be triggered.
1843+ if hookenv.is_leader():
1844+ return self.grant(lock, unit)
1845+
1846+ return False # Can't acquire lock, yet. Maybe next hook.
1847+
1848+ def granted(self, lock):
1849+ '''Return True if a previously requested lock has been granted'''
1850+ unit = hookenv.local_unit()
1851+ ts = self.requests[unit].get(lock)
1852+ if ts and self.grants.get(unit, {}).get(lock) == ts:
1853+ return True
1854+ return False
1855+
1856+ def requested(self, lock):
1857+ '''Return True if we are in the queue for the lock'''
1858+ return lock in self.requests[hookenv.local_unit()]
1859+
1860+ def request_timestamp(self, lock):
1861+ '''Return the timestamp of our outstanding request for lock, or None.
1862+
1863+ Returns a datetime.datetime() UTC timestamp, with no tzinfo attribute.
1864+ '''
1865+ ts = self.requests[hookenv.local_unit()].get(lock, None)
1866+ if ts is not None:
1867+ return datetime.strptime(ts, _timestamp_format)
1868+
1869+ def handle(self):
1870+ if not hookenv.is_leader():
1871+ return # Only the leader can grant requests.
1872+
1873+ self.msg('Leader handling coordinator requests')
1874+
1875+ # Clear our grants that have been released.
1876+ for unit in self.grants.keys():
1877+ for lock, grant_ts in list(self.grants[unit].items()):
1878+ req_ts = self.requests.get(unit, {}).get(lock)
1879+ if req_ts != grant_ts:
1880+ # The request timestamp does not match the granted
1881+ # timestamp. Several hooks on 'unit' may have run
1882+ # before the leader got a chance to make a decision,
1883+ # and 'unit' may have released its lock and attempted
1884+ # to reacquire it. This will change the timestamp,
1885+ # and we correctly revoke the old grant putting it
1886+ # to the end of the queue.
1887+ ts = datetime.strptime(self.grants[unit][lock],
1888+ _timestamp_format)
1889+ del self.grants[unit][lock]
1890+ self.released(unit, lock, ts)
1891+
1892+ # Grant locks
1893+ for unit in self.requests.keys():
1894+ for lock in self.requests[unit]:
1895+ self.grant(lock, unit)
1896+
1897+ def grant(self, lock, unit):
1898+ '''Maybe grant the lock to a unit.
1899+
1900+ The decision to grant the lock or not is made for $lock
1901+ by a corresponding method grant_$lock, which you may define
1902+ in a subclass. If no such method is defined, the default_grant
1903+ method is used. See Serial.default_grant() for details.
1904+ '''
1905+ if not hookenv.is_leader():
1906+ return False # Not the leader, so we cannot grant.
1907+
1908+ # Set of units already granted the lock.
1909+ granted = set()
1910+ for u in self.grants:
1911+ if lock in self.grants[u]:
1912+ granted.add(u)
1913+ if unit in granted:
1914+ return True # Already granted.
1915+
1916+ # Ordered list of units waiting for the lock.
1917+ reqs = set()
1918+ for u in self.requests:
1919+ if u in granted:
1920+ continue # In the granted set. Not wanted in the req list.
1921+ for l, ts in self.requests[u].items():
1922+ if l == lock:
1923+ reqs.add((ts, u))
1924+ queue = [t[1] for t in sorted(reqs)]
1925+ if unit not in queue:
1926+ return False # Unit has not requested the lock.
1927+
1928+ # Locate custom logic, or fallback to the default.
1929+ grant_func = getattr(self, 'grant_{}'.format(lock), self.default_grant)
1930+
1931+ if grant_func(lock, unit, granted, queue):
1932+ # Grant the lock.
1933+ self.msg('Leader grants {} to {}'.format(lock, unit))
1934+ self.grants.setdefault(unit, {})[lock] = self.requests[unit][lock]
1935+ return True
1936+
1937+ return False
1938+
1939+ def released(self, unit, lock, timestamp):
1940+ '''Called on the leader when it has released a lock.
1941+
1942+ By default, does nothing but log messages. Override if you
1943+ need to perform additional housekeeping when a lock is released,
1944+ for example recording timestamps.
1945+ '''
1946+ interval = _utcnow() - timestamp
1947+ self.msg('Leader released {} from {}, held {}'.format(lock, unit,
1948+ interval))
1949+
1950+ def require(self, lock, guard_func, *guard_args, **guard_kw):
1951+ """Decorate a function to be run only when a lock is acquired.
1952+
1953+ The lock is requested if the guard function returns True.
1954+
1955+ The decorated function is called if the lock has been granted.
1956+ """
1957+ def decorator(f):
1958+ @wraps(f)
1959+ def wrapper(*args, **kw):
1960+ if self.granted(lock):
1961+ self.msg('Granted {}'.format(lock))
1962+ return f(*args, **kw)
1963+ if guard_func(*guard_args, **guard_kw) and self.acquire(lock):
1964+ return f(*args, **kw)
1965+ return None
1966+ return wrapper
1967+ return decorator
1968+
1969+ def msg(self, msg):
1970+ '''Emit a message. Override to customize log spam.'''
1971+ hookenv.log('coordinator.{} {}'.format(self._name(), msg),
1972+ level=hookenv.INFO)
1973+
1974+ def _name(self):
1975+ return self.__class__.__name__
1976+
1977+ def _load_state(self):
1978+ self.msg('Loading state'.format(self._name()))
1979+
1980+ # All responses must be stored in the leadership settings.
1981+ # The leader cannot use local state, as a different unit may
1982+ # be leader next time. Which is fine, as the leadership
1983+ # settings are always available.
1984+ self.grants = json.loads(hookenv.leader_get(self.key) or '{}')
1985+
1986+ local_unit = hookenv.local_unit()
1987+
1988+ # All requests must be stored on the peer relation. This is
1989+ # the only channel units have to communicate with the leader.
1990+ # Even the leader needs to store its requests here, as a
1991+ # different unit may be leader by the time the request can be
1992+ # granted.
1993+ if self.relid is None:
1994+ # The peer relation is not available. Maybe we are early in
1995+            # the unit's lifecycle. Maybe this unit is standalone.
1996+ # Fallback to using local state.
1997+ self.msg('No peer relation. Loading local state')
1998+ self.requests = {local_unit: self._load_local_state()}
1999+ else:
2000+ self.requests = self._load_peer_state()
2001+ if local_unit not in self.requests:
2002+ # The peer relation has just been joined. Update any state
2003+ # loaded from our peers with our local state.
2004+ self.msg('New peer relation. Merging local state')
2005+ self.requests[local_unit] = self._load_local_state()
2006+
2007+ def _emit_state(self):
2008+        # Emit this unit's lock status.
2009+ for lock in sorted(self.requests[hookenv.local_unit()].keys()):
2010+ if self.granted(lock):
2011+ self.msg('Granted {}'.format(lock))
2012+ else:
2013+ self.msg('Waiting on {}'.format(lock))
2014+
2015+ def _save_state(self):
2016+ self.msg('Publishing state'.format(self._name()))
2017+ if hookenv.is_leader():
2018+ # sort_keys to ensure stability.
2019+ raw = json.dumps(self.grants, sort_keys=True)
2020+ hookenv.leader_set({self.key: raw})
2021+
2022+ local_unit = hookenv.local_unit()
2023+
2024+ if self.relid is None:
2025+ # No peer relation yet. Fallback to local state.
2026+ self.msg('No peer relation. Saving local state')
2027+ self._save_local_state(self.requests[local_unit])
2028+ else:
2029+ # sort_keys to ensure stability.
2030+ raw = json.dumps(self.requests[local_unit], sort_keys=True)
2031+ hookenv.relation_set(self.relid, relation_settings={self.key: raw})
2032+
2033+ def _load_peer_state(self):
2034+ requests = {}
2035+ units = set(hookenv.related_units(self.relid))
2036+ units.add(hookenv.local_unit())
2037+ for unit in units:
2038+ raw = hookenv.relation_get(self.key, unit, self.relid)
2039+ if raw:
2040+ requests[unit] = json.loads(raw)
2041+ return requests
2042+
2043+ def _local_state_filename(self):
2044+ # Include the class name. We allow multiple BaseCoordinator
2045+ # subclasses to be instantiated, and they are singletons, so
2046+ # this avoids conflicts (unless someone creates and uses two
2047+ # BaseCoordinator subclasses with the same class name, so don't
2048+ # do that).
2049+ return '.charmhelpers.coordinator.{}'.format(self._name())
2050+
2051+ def _load_local_state(self):
2052+ fn = self._local_state_filename()
2053+ if os.path.exists(fn):
2054+ with open(fn, 'r') as f:
2055+ return json.load(f)
2056+ return {}
2057+
2058+ def _save_local_state(self, state):
2059+ fn = self._local_state_filename()
2060+ with open(fn, 'w') as f:
2061+ json.dump(state, f)
2062+
2063+ def _release_granted(self):
2064+ # At the end of every hook, release all locks granted to
2065+ # this unit. If a hook neglects to make use of what it
2066+ # requested, it will just have to make the request again.
2067+ # Implicit release is the only way this will work, as
2068+ # if the unit is standalone there may be no future triggers
2069+ # called to do a manual release.
2070+ unit = hookenv.local_unit()
2071+ for lock in list(self.requests[unit].keys()):
2072+ if self.granted(lock):
2073+ self.msg('Released local {} lock'.format(lock))
2074+ del self.requests[unit][lock]
2075+
2076+
2077+class Serial(BaseCoordinator):
2078+ def default_grant(self, lock, unit, granted, queue):
2079+ '''Default logic to grant a lock to a unit. Unless overridden,
2080+ only one unit may hold the lock and it will be granted to the
2081+ earliest queued request.
2082+
2083+ To define custom logic for $lock, create a subclass and
2084+ define a grant_$lock method.
2085+
2086+ `unit` is the unit name making the request.
2087+
2088+ `granted` is the set of units already granted the lock. It will
2089+ never include `unit`. It may be empty.
2090+
2091+ `queue` is the list of units waiting for the lock, ordered by time
2092+ of request. It will always include `unit`, but `unit` is not
2093+ necessarily first.
2094+
2095+ Returns True if the lock should be granted to `unit`.
2096+ '''
2097+ return unit == queue[0] and not granted
2098+
2099+
2100+def _implicit_peer_relation_name():
2101+ md = hookenv.metadata()
2102+ assert 'peers' in md, 'No peer relations in metadata.yaml'
2103+ return sorted(md['peers'].keys())[0]
2104+
2105+
2106+# A human readable, sortable UTC timestamp format.
2107+_timestamp_format = '%Y-%m-%d %H:%M:%S.%fZ'
2108+
2109+
2110+def _utcnow(): # pragma: no cover
2111+ # This wrapper exists as mocking datetime methods is problematic.
2112+ return datetime.utcnow()
2113+
2114+
2115+def _timestamp():
2116+ return _utcnow().strftime(_timestamp_format)
2117
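
As an illustration of the grant_$lock hook described in BaseCoordinator.grant()
above, a subclass can relax the one-unit-at-a-time rule for a particular lock.
The sketch below is illustrative only: the 'repair' lock name and the limit of
two concurrent holders are assumptions, not part of this branch.

    from charmhelpers import coordinator

    class RelaxedSerial(coordinator.Serial):
        '''Serial behaviour for most locks, but allow two holders of 'repair'.'''

        def grant_repair(self, lock, unit, granted, queue):
            # `granted` is the set of units already holding this lock and
            # `queue` is ordered by request time; `unit` is always in `queue`.
            return len(granted) < 2 and queue[0] == unit

Locks without a matching grant_<lockname> method fall back to default_grant().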
2118=== modified file 'hooks/charmhelpers/core/hookenv.py'
2119--- hooks/charmhelpers/core/hookenv.py 2015-04-03 15:42:21 +0000
2120+++ hooks/charmhelpers/core/hookenv.py 2015-06-30 07:36:26 +0000
2121@@ -20,12 +20,17 @@
2122 # Authors:
2123 # Charm Helpers Developers <juju@lists.ubuntu.com>
2124
2125+from __future__ import print_function
2126+from distutils.version import LooseVersion
2127 from functools import wraps
2128+import glob
2129 import os
2130 import json
2131 import yaml
2132 import subprocess
2133 import sys
2134+import errno
2135+import tempfile
2136 from subprocess import CalledProcessError
2137
2138 import six
2139@@ -90,7 +95,18 @@
2140 if not isinstance(message, six.string_types):
2141 message = repr(message)
2142 command += [message]
2143- subprocess.call(command)
2144+ # Missing juju-log should not cause failures in unit tests
2145+ # Send log output to stderr
2146+ try:
2147+ subprocess.call(command)
2148+ except OSError as e:
2149+ if e.errno == errno.ENOENT:
2150+ if level:
2151+ message = "{}: {}".format(level, message)
2152+ message = "juju-log: {}".format(message)
2153+ print(message, file=sys.stderr)
2154+ else:
2155+ raise
2156
2157
2158 class Serializable(UserDict):
2159@@ -228,6 +244,7 @@
2160 self.path = os.path.join(charm_dir(), Config.CONFIG_FILE_NAME)
2161 if os.path.exists(self.path):
2162 self.load_previous()
2163+ atexit(self._implicit_save)
2164
2165 def __getitem__(self, key):
2166 """For regular dict lookups, check the current juju config first,
2167@@ -307,6 +324,10 @@
2168 with open(self.path, 'w') as f:
2169 json.dump(self, f)
2170
2171+ def _implicit_save(self):
2172+ if self.implicit_save:
2173+ self.save()
2174+
2175
2176 @cached
2177 def config(scope=None):
2178@@ -349,18 +370,49 @@
2179 """Set relation information for the current unit"""
2180 relation_settings = relation_settings if relation_settings else {}
2181 relation_cmd_line = ['relation-set']
2182+ accepts_file = "--file" in subprocess.check_output(
2183+ relation_cmd_line + ["--help"], universal_newlines=True)
2184 if relation_id is not None:
2185 relation_cmd_line.extend(('-r', relation_id))
2186- for k, v in (list(relation_settings.items()) + list(kwargs.items())):
2187- if v is None:
2188- relation_cmd_line.append('{}='.format(k))
2189- else:
2190- relation_cmd_line.append('{}={}'.format(k, v))
2191- subprocess.check_call(relation_cmd_line)
2192+ settings = relation_settings.copy()
2193+ settings.update(kwargs)
2194+ for key, value in settings.items():
2195+ # Force value to be a string: it always should, but some call
2196+ # sites pass in things like dicts or numbers.
2197+ if value is not None:
2198+ settings[key] = "{}".format(value)
2199+ if accepts_file:
2200+ # --file was introduced in Juju 1.23.2. Use it by default if
2201+ # available, since otherwise we'll break if the relation data is
2202+ # too big. Ideally we should tell relation-set to read the data from
2203+ # stdin, but that feature is broken in 1.23.2: Bug #1454678.
2204+ with tempfile.NamedTemporaryFile(delete=False) as settings_file:
2205+ settings_file.write(yaml.safe_dump(settings).encode("utf-8"))
2206+ subprocess.check_call(
2207+ relation_cmd_line + ["--file", settings_file.name])
2208+ os.remove(settings_file.name)
2209+ else:
2210+ for key, value in settings.items():
2211+ if value is None:
2212+ relation_cmd_line.append('{}='.format(key))
2213+ else:
2214+ relation_cmd_line.append('{}={}'.format(key, value))
2215+ subprocess.check_call(relation_cmd_line)
2216 # Flush cache of any relation-gets for local unit
2217 flush(local_unit())
2218
2219
2220+def relation_clear(r_id=None):
2221+ ''' Clears any relation data already set on relation r_id '''
2222+ settings = relation_get(rid=r_id,
2223+ unit=local_unit())
2224+ for setting in settings:
2225+ if setting not in ['public-address', 'private-address']:
2226+ settings[setting] = None
2227+ relation_set(relation_id=r_id,
2228+ **settings)
2229+
2230+
2231 @cached
2232 def relation_ids(reltype=None):
2233 """A list of relation_ids"""
2234@@ -542,10 +594,14 @@
2235 hooks.execute(sys.argv)
2236 """
2237
2238- def __init__(self, config_save=True):
2239+ def __init__(self, config_save=None):
2240 super(Hooks, self).__init__()
2241 self._hooks = {}
2242- self._config_save = config_save
2243+
2244+ # For unknown reasons, we allow the Hooks constructor to override
2245+ # config().implicit_save.
2246+ if config_save is not None:
2247+ config().implicit_save = config_save
2248
2249 def register(self, name, function):
2250 """Register a hook"""
2251@@ -553,13 +609,16 @@
2252
2253 def execute(self, args):
2254 """Execute a registered hook based on args[0]"""
2255+ _run_atstart()
2256 hook_name = os.path.basename(args[0])
2257 if hook_name in self._hooks:
2258- self._hooks[hook_name]()
2259- if self._config_save:
2260- cfg = config()
2261- if cfg.implicit_save:
2262- cfg.save()
2263+ try:
2264+ self._hooks[hook_name]()
2265+ except SystemExit as x:
2266+ if x.code is None or x.code == 0:
2267+ _run_atexit()
2268+ raise
2269+ _run_atexit()
2270 else:
2271 raise UnregisteredHookError(hook_name)
2272
2273@@ -606,3 +665,160 @@
2274
2275 The results set by action_set are preserved."""
2276 subprocess.check_call(['action-fail', message])
2277+
2278+
2279+def status_set(workload_state, message):
2280+ """Set the workload state with a message
2281+
2282+ Use status-set to set the workload state with a message which is visible
2283+ to the user via juju status. If the status-set command is not found then
2284+    assume this is juju < 1.23 and juju-log the message instead.
2285+
2286+ workload_state -- valid juju workload state.
2287+ message -- status update message
2288+ """
2289+ valid_states = ['maintenance', 'blocked', 'waiting', 'active']
2290+ if workload_state not in valid_states:
2291+ raise ValueError(
2292+ '{!r} is not a valid workload state'.format(workload_state)
2293+ )
2294+ cmd = ['status-set', workload_state, message]
2295+ try:
2296+ ret = subprocess.call(cmd)
2297+ if ret == 0:
2298+ return
2299+ except OSError as e:
2300+ if e.errno != errno.ENOENT:
2301+ raise
2302+ log_message = 'status-set failed: {} {}'.format(workload_state,
2303+ message)
2304+ log(log_message, level='INFO')
2305+
2306+
2307+def status_get():
2308+ """Retrieve the previously set juju workload state
2309+
2310+ If the status-set command is not found then assume this is juju < 1.23 and
2311+ return 'unknown'
2312+ """
2313+ cmd = ['status-get']
2314+ try:
2315+ raw_status = subprocess.check_output(cmd, universal_newlines=True)
2316+ status = raw_status.rstrip()
2317+ return status
2318+ except OSError as e:
2319+ if e.errno == errno.ENOENT:
2320+ return 'unknown'
2321+ else:
2322+ raise
2323+
2324+
2325+def translate_exc(from_exc, to_exc):
2326+ def inner_translate_exc1(f):
2327+ def inner_translate_exc2(*args, **kwargs):
2328+ try:
2329+ return f(*args, **kwargs)
2330+ except from_exc:
2331+ raise to_exc
2332+
2333+ return inner_translate_exc2
2334+
2335+ return inner_translate_exc1
2336+
2337+
2338+@translate_exc(from_exc=OSError, to_exc=NotImplementedError)
2339+def is_leader():
2340+ """Does the current unit hold the juju leadership
2341+
2342+ Uses juju to determine whether the current unit is the leader of its peers
2343+ """
2344+ cmd = ['is-leader', '--format=json']
2345+ return json.loads(subprocess.check_output(cmd).decode('UTF-8'))
2346+
2347+
2348+@translate_exc(from_exc=OSError, to_exc=NotImplementedError)
2349+def leader_get(attribute=None):
2350+ """Juju leader get value(s)"""
2351+ cmd = ['leader-get', '--format=json'] + [attribute or '-']
2352+ return json.loads(subprocess.check_output(cmd).decode('UTF-8'))
2353+
2354+
2355+@translate_exc(from_exc=OSError, to_exc=NotImplementedError)
2356+def leader_set(settings=None, **kwargs):
2357+ """Juju leader set value(s)"""
2358+ # Don't log secrets.
2359+ # log("Juju leader-set '%s'" % (settings), level=DEBUG)
2360+ cmd = ['leader-set']
2361+ settings = settings or {}
2362+ settings.update(kwargs)
2363+ for k, v in settings.items():
2364+ if v is None:
2365+ cmd.append('{}='.format(k))
2366+ else:
2367+ cmd.append('{}={}'.format(k, v))
2368+ subprocess.check_call(cmd)
2369+
2370+
2371+@cached
2372+def juju_version():
2373+ """Full version string (eg. '1.23.3.1-trusty-amd64')"""
2374+ # Per https://bugs.launchpad.net/juju-core/+bug/1455368/comments/1
2375+ jujud = glob.glob('/var/lib/juju/tools/machine-*/jujud')[0]
2376+ return subprocess.check_output([jujud, 'version'],
2377+ universal_newlines=True).strip()
2378+
2379+
2380+@cached
2381+def has_juju_version(minimum_version):
2382+ """Return True if the Juju version is at least the provided version"""
2383+ return LooseVersion(juju_version()) >= LooseVersion(minimum_version)
2384+
2385+
2386+_atexit = []
2387+_atstart = []
2388+
2389+
2390+def atstart(callback, *args, **kwargs):
2391+ '''Schedule a callback to run before the main hook.
2392+
2393+ Callbacks are run in the order they were added.
2394+
2395+ This is useful for modules and classes to perform initialization
2396+ and inject behavior. In particular:
2397+ - Run common code before all of your hooks, such as logging
2398+ the hook name or interesting relation data.
2399+ - Defer object or module initialization that requires a hook
2400+ context until we know there actually is a hook context,
2401+ making testing easier.
2402+ - Rather than requiring charm authors to include boilerplate to
2403+ invoke your helper's behavior, have it run automatically if
2404+ your object is instantiated or module imported.
2405+
2406+    This is not at all useful after your hook framework has been launched.
2407+ '''
2408+ global _atstart
2409+ _atstart.append((callback, args, kwargs))
2410+
2411+
2412+def atexit(callback, *args, **kwargs):
2413+ '''Schedule a callback to run on successful hook completion.
2414+
2415+ Callbacks are run in the reverse order that they were added.'''
2416+ _atexit.append((callback, args, kwargs))
2417+
2418+
2419+def _run_atstart():
2420+ '''Hook frameworks must invoke this before running the main hook body.'''
2421+ global _atstart
2422+ for callback, args, kwargs in _atstart:
2423+ callback(*args, **kwargs)
2424+ del _atstart[:]
2425+
2426+
2427+def _run_atexit():
2428+ '''Hook frameworks must invoke this after the main hook body has
2429+ successfully completed. Do not invoke it if the hook fails.'''
2430+ global _atexit
2431+ for callback, args, kwargs in reversed(_atexit):
2432+ callback(*args, **kwargs)
2433+ del _atexit[:]
2434
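
A minimal sketch of how a charm module might use the new atstart(), atexit()
and status_set() helpers above. The log message and the 'active' state are
placeholders, and it assumes a hook framework (Hooks.execute() or
ServiceManager.manage()) that invokes _run_atstart() and _run_atexit() as shown
in the diff.

    from charmhelpers.core import hookenv

    def announce():
        # Runs before the main hook body, via _run_atstart().
        hookenv.log('Running hook {}'.format(hookenv.hook_name()))

    hookenv.atstart(announce)

    # Runs only if the hook completes successfully, in reverse order of
    # registration.
    hookenv.atexit(hookenv.status_set, 'active', 'Live')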
2435=== modified file 'hooks/charmhelpers/core/host.py'
2436--- hooks/charmhelpers/core/host.py 2015-04-03 15:42:21 +0000
2437+++ hooks/charmhelpers/core/host.py 2015-06-30 07:36:26 +0000
2438@@ -413,7 +413,7 @@
2439 the pkgcache argument is None. Be sure to add charmhelpers.fetch if
2440 you call this function, or pass an apt_pkg.Cache() instance.
2441 '''
2442- from apt import apt_pkg
2443+ import apt_pkg
2444 if not pkgcache:
2445 from charmhelpers.fetch import apt_cache
2446 pkgcache = apt_cache()
2447
2448=== modified file 'hooks/charmhelpers/core/services/base.py'
2449--- hooks/charmhelpers/core/services/base.py 2015-01-26 13:07:31 +0000
2450+++ hooks/charmhelpers/core/services/base.py 2015-06-30 07:36:26 +0000
2451@@ -15,8 +15,8 @@
2452 # along with charm-helpers. If not, see <http://www.gnu.org/licenses/>.
2453
2454 import os
2455-import re
2456 import json
2457+from inspect import getargspec
2458 from collections import Iterable, OrderedDict
2459
2460 from charmhelpers.core import host
2461@@ -128,15 +128,18 @@
2462 """
2463 Handle the current hook by doing The Right Thing with the registered services.
2464 """
2465- hook_name = hookenv.hook_name()
2466- if hook_name == 'stop':
2467- self.stop_services()
2468- else:
2469- self.provide_data()
2470- self.reconfigure_services()
2471- cfg = hookenv.config()
2472- if cfg.implicit_save:
2473- cfg.save()
2474+ hookenv._run_atstart()
2475+ try:
2476+ hook_name = hookenv.hook_name()
2477+ if hook_name == 'stop':
2478+ self.stop_services()
2479+ else:
2480+ self.reconfigure_services()
2481+ self.provide_data()
2482+ except SystemExit as x:
2483+ if x.code is None or x.code == 0:
2484+ hookenv._run_atexit()
2485+ hookenv._run_atexit()
2486
2487 def provide_data(self):
2488 """
2489@@ -145,15 +148,36 @@
2490 A provider must have a `name` attribute, which indicates which relation
2491 to set data on, and a `provide_data()` method, which returns a dict of
2492 data to set.
2493+
2494+ The `provide_data()` method can optionally accept two parameters:
2495+
2496+ * ``remote_service`` The name of the remote service that the data will
2497+ be provided to. The `provide_data()` method will be called once
2498+ for each connected service (not unit). This allows the method to
2499+ tailor its data to the given service.
2500+ * ``service_ready`` Whether or not the service definition had all of
2501+ its requirements met, and thus the ``data_ready`` callbacks run.
2502+
2503+ Note that the ``provided_data`` methods are now called **after** the
2504+ ``data_ready`` callbacks are run. This gives the ``data_ready`` callbacks
2505+        a chance to generate any data necessary for providing to the remote
2506+ services.
2507 """
2508- hook_name = hookenv.hook_name()
2509- for service in self.services.values():
2510+ for service_name, service in self.services.items():
2511+ service_ready = self.is_ready(service_name)
2512 for provider in service.get('provided_data', []):
2513- if re.match(r'{}-relation-(joined|changed)'.format(provider.name), hook_name):
2514- data = provider.provide_data()
2515- _ready = provider._is_ready(data) if hasattr(provider, '_is_ready') else data
2516- if _ready:
2517- hookenv.relation_set(None, data)
2518+ for relid in hookenv.relation_ids(provider.name):
2519+ units = hookenv.related_units(relid)
2520+ if not units:
2521+ continue
2522+ remote_service = units[0].split('/')[0]
2523+ argspec = getargspec(provider.provide_data)
2524+ if len(argspec.args) > 1:
2525+ data = provider.provide_data(remote_service, service_ready)
2526+ else:
2527+ data = provider.provide_data()
2528+ if data:
2529+ hookenv.relation_set(relid, data)
2530
2531 def reconfigure_services(self, *service_names):
2532 """
2533
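
To illustrate the extended provide_data() contract above, here is a minimal
hypothetical provider; the class name and the 'database' relation name are
assumptions, while the `name` attribute and the optional
(remote_service, service_ready) signature come from the code above.

    from charmhelpers.core import hookenv

    class ExampleProvider(object):
        # `name` selects which relation the returned data is set on.
        name = 'database'

        def provide_data(self, remote_service, service_ready):
            # Called once per connected remote service, after the
            # data_ready callbacks have had a chance to run.
            if not service_ready:
                return {}
            return {'host': hookenv.unit_private_ip(),
                    'client': remote_service}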
2534=== modified file 'hooks/charmhelpers/core/strutils.py'
2535--- hooks/charmhelpers/core/strutils.py 2015-02-27 05:36:25 +0000
2536+++ hooks/charmhelpers/core/strutils.py 2015-06-30 07:36:26 +0000
2537@@ -33,9 +33,9 @@
2538
2539 value = value.strip().lower()
2540
2541- if value in ['y', 'yes', 'true', 't']:
2542+ if value in ['y', 'yes', 'true', 't', 'on']:
2543 return True
2544- elif value in ['n', 'no', 'false', 'f']:
2545+ elif value in ['n', 'no', 'false', 'f', 'off']:
2546 return False
2547
2548 msg = "Unable to interpret string value '%s' as boolean" % (value)
2549
2550=== modified file 'hooks/charmhelpers/fetch/giturl.py'
2551--- hooks/charmhelpers/fetch/giturl.py 2015-02-18 14:24:32 +0000
2552+++ hooks/charmhelpers/fetch/giturl.py 2015-06-30 07:36:26 +0000
2553@@ -45,14 +45,16 @@
2554 else:
2555 return True
2556
2557- def clone(self, source, dest, branch):
2558+ def clone(self, source, dest, branch, depth=None):
2559 if not self.can_handle(source):
2560 raise UnhandledSource("Cannot handle {}".format(source))
2561
2562- repo = Repo.clone_from(source, dest)
2563- repo.git.checkout(branch)
2564+ if depth:
2565+ Repo.clone_from(source, dest, branch=branch, depth=depth)
2566+ else:
2567+ Repo.clone_from(source, dest, branch=branch)
2568
2569- def install(self, source, branch="master", dest=None):
2570+ def install(self, source, branch="master", dest=None, depth=None):
2571 url_parts = self.parse_url(source)
2572 branch_name = url_parts.path.strip("/").split("/")[-1]
2573 if dest:
2574@@ -63,7 +65,7 @@
2575 if not os.path.exists(dest_dir):
2576 mkdir(dest_dir, perms=0o755)
2577 try:
2578- self.clone(source, dest_dir, branch)
2579+ self.clone(source, dest_dir, branch, depth)
2580 except GitCommandError as e:
2581 raise UnhandledSource(e.message)
2582 except OSError as e:
2583
2584=== modified symlink 'hooks/cluster-relation-changed' (properties changed: -x to +x)
2585=== target was u'hooks.py'
2586--- hooks/cluster-relation-changed 1970-01-01 00:00:00 +0000
2587+++ hooks/cluster-relation-changed 2015-06-30 07:36:26 +0000
2588@@ -0,0 +1,20 @@
2589+#!/usr/bin/python3
2590+# Copyright 2015 Canonical Ltd.
2591+#
2592+# This file is part of the Cassandra Charm for Juju.
2593+#
2594+# This program is free software: you can redistribute it and/or modify
2595+# it under the terms of the GNU General Public License version 3, as
2596+# published by the Free Software Foundation.
2597+#
2598+# This program is distributed in the hope that it will be useful, but
2599+# WITHOUT ANY WARRANTY; without even the implied warranties of
2600+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2601+# PURPOSE. See the GNU General Public License for more details.
2602+#
2603+# You should have received a copy of the GNU General Public License
2604+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2605+import hooks
2606+if __name__ == '__main__':
2607+ hooks.bootstrap()
2608+ hooks.default_hook()
2609
2610=== modified symlink 'hooks/cluster-relation-departed' (properties changed: -x to +x)
2611=== target was u'hooks.py'
2612--- hooks/cluster-relation-departed 1970-01-01 00:00:00 +0000
2613+++ hooks/cluster-relation-departed 2015-06-30 07:36:26 +0000
2614@@ -0,0 +1,20 @@
2615+#!/usr/bin/python3
2616+# Copyright 2015 Canonical Ltd.
2617+#
2618+# This file is part of the Cassandra Charm for Juju.
2619+#
2620+# This program is free software: you can redistribute it and/or modify
2621+# it under the terms of the GNU General Public License version 3, as
2622+# published by the Free Software Foundation.
2623+#
2624+# This program is distributed in the hope that it will be useful, but
2625+# WITHOUT ANY WARRANTY; without even the implied warranties of
2626+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2627+# PURPOSE. See the GNU General Public License for more details.
2628+#
2629+# You should have received a copy of the GNU General Public License
2630+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2631+import hooks
2632+if __name__ == '__main__':
2633+ hooks.bootstrap()
2634+ hooks.default_hook()
2635
2636=== removed symlink 'hooks/cluster-relation-joined'
2637=== target was u'hooks.py'
2638=== modified symlink 'hooks/config-changed' (properties changed: -x to +x)
2639=== target was u'hooks.py'
2640--- hooks/config-changed 1970-01-01 00:00:00 +0000
2641+++ hooks/config-changed 2015-06-30 07:36:26 +0000
2642@@ -0,0 +1,20 @@
2643+#!/usr/bin/python3
2644+# Copyright 2015 Canonical Ltd.
2645+#
2646+# This file is part of the Cassandra Charm for Juju.
2647+#
2648+# This program is free software: you can redistribute it and/or modify
2649+# it under the terms of the GNU General Public License version 3, as
2650+# published by the Free Software Foundation.
2651+#
2652+# This program is distributed in the hope that it will be useful, but
2653+# WITHOUT ANY WARRANTY; without even the implied warranties of
2654+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2655+# PURPOSE. See the GNU General Public License for more details.
2656+#
2657+# You should have received a copy of the GNU General Public License
2658+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2659+import hooks
2660+if __name__ == '__main__':
2661+ hooks.bootstrap()
2662+ hooks.default_hook()
2663
2664=== added file 'hooks/coordinator.py'
2665--- hooks/coordinator.py 1970-01-01 00:00:00 +0000
2666+++ hooks/coordinator.py 2015-06-30 07:36:26 +0000
2667@@ -0,0 +1,35 @@
2668+# Copyright 2015 Canonical Ltd.
2669+#
2670+# This file is part of the Cassandra Charm for Juju.
2671+#
2672+# This program is free software: you can redistribute it and/or modify
2673+# it under the terms of the GNU General Public License version 3, as
2674+# published by the Free Software Foundation.
2675+#
2676+# This program is distributed in the hope that it will be useful, but
2677+# WITHOUT ANY WARRANTY; without even the implied warranties of
2678+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2679+# PURPOSE. See the GNU General Public License for more details.
2680+#
2681+# You should have received a copy of the GNU General Public License
2682+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2683+from charmhelpers.coordinator import BaseCoordinator
2684+
2685+
2686+class CassandraCoordinator(BaseCoordinator):
2687+ def default_grant(self, lock, unit, granted, queue):
2688+ '''Grant locks to only one unit at a time, regardless of its name.
2689+
2690+ This lets us keep separate locks like repair and restart,
2691+ while ensuring the operations do not occur on different nodes
2692+ at the same time.
2693+ '''
2694+ # Return True if this unit has already been granted a lock.
2695+ if self.grants.get(unit):
2696+ return True
2697+
2698+ # Otherwise, return True if the unit is first in the queue.
2699+ return queue[0] == unit and not granted
2700+
2701+
2702+coordinator = CassandraCoordinator()
2703
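
A minimal sketch of how a hook body might use this coordinator. needs_restart()
is a hypothetical guard, and the 'cassandra' service name is a placeholder (the
charm itself resolves the real name via helpers.get_cassandra_service()).

    from charmhelpers.core import host
    from coordinator import coordinator

    def needs_restart():
        # Placeholder guard; the real charm inspects config and unit state.
        return False

    def maybe_restart():
        # Only request the lock when a restart is actually wanted;
        # requesting it unconditionally would trigger an endless
        # cascade of peer and leadership hooks.
        if needs_restart():
            coordinator.acquire('restart')
        if coordinator.granted('restart'):
            host.service_restart('cassandra')

Granted locks are released automatically at the end of the hook in which they
are granted, so a unit that no longer needs the lock simply lets it lapse and
requests it again later.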
2704=== modified symlink 'hooks/data-relation-changed' (properties changed: -x to +x)
2705=== target was u'hooks.py'
2706--- hooks/data-relation-changed 1970-01-01 00:00:00 +0000
2707+++ hooks/data-relation-changed 2015-06-30 07:36:26 +0000
2708@@ -0,0 +1,20 @@
2709+#!/usr/bin/python3
2710+# Copyright 2015 Canonical Ltd.
2711+#
2712+# This file is part of the Cassandra Charm for Juju.
2713+#
2714+# This program is free software: you can redistribute it and/or modify
2715+# it under the terms of the GNU General Public License version 3, as
2716+# published by the Free Software Foundation.
2717+#
2718+# This program is distributed in the hope that it will be useful, but
2719+# WITHOUT ANY WARRANTY; without even the implied warranties of
2720+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2721+# PURPOSE. See the GNU General Public License for more details.
2722+#
2723+# You should have received a copy of the GNU General Public License
2724+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2725+import hooks
2726+if __name__ == '__main__':
2727+ hooks.bootstrap()
2728+ hooks.default_hook()
2729
2730=== modified symlink 'hooks/data-relation-departed' (properties changed: -x to +x)
2731=== target was u'hooks.py'
2732--- hooks/data-relation-departed 1970-01-01 00:00:00 +0000
2733+++ hooks/data-relation-departed 2015-06-30 07:36:26 +0000
2734@@ -0,0 +1,20 @@
2735+#!/usr/bin/python3
2736+# Copyright 2015 Canonical Ltd.
2737+#
2738+# This file is part of the Cassandra Charm for Juju.
2739+#
2740+# This program is free software: you can redistribute it and/or modify
2741+# it under the terms of the GNU General Public License version 3, as
2742+# published by the Free Software Foundation.
2743+#
2744+# This program is distributed in the hope that it will be useful, but
2745+# WITHOUT ANY WARRANTY; without even the implied warranties of
2746+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2747+# PURPOSE. See the GNU General Public License for more details.
2748+#
2749+# You should have received a copy of the GNU General Public License
2750+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2751+import hooks
2752+if __name__ == '__main__':
2753+ hooks.bootstrap()
2754+ hooks.default_hook()
2755
2756=== modified symlink 'hooks/database-admin-relation-changed' (properties changed: -x to +x)
2757=== target was u'hooks.py'
2758--- hooks/database-admin-relation-changed 1970-01-01 00:00:00 +0000
2759+++ hooks/database-admin-relation-changed 2015-06-30 07:36:26 +0000
2760@@ -0,0 +1,20 @@
2761+#!/usr/bin/python3
2762+# Copyright 2015 Canonical Ltd.
2763+#
2764+# This file is part of the Cassandra Charm for Juju.
2765+#
2766+# This program is free software: you can redistribute it and/or modify
2767+# it under the terms of the GNU General Public License version 3, as
2768+# published by the Free Software Foundation.
2769+#
2770+# This program is distributed in the hope that it will be useful, but
2771+# WITHOUT ANY WARRANTY; without even the implied warranties of
2772+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2773+# PURPOSE. See the GNU General Public License for more details.
2774+#
2775+# You should have received a copy of the GNU General Public License
2776+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2777+import hooks
2778+if __name__ == '__main__':
2779+ hooks.bootstrap()
2780+ hooks.default_hook()
2781
2782=== modified symlink 'hooks/database-relation-changed' (properties changed: -x to +x)
2783=== target was u'hooks.py'
2784--- hooks/database-relation-changed 1970-01-01 00:00:00 +0000
2785+++ hooks/database-relation-changed 2015-06-30 07:36:26 +0000
2786@@ -0,0 +1,20 @@
2787+#!/usr/bin/python3
2788+# Copyright 2015 Canonical Ltd.
2789+#
2790+# This file is part of the Cassandra Charm for Juju.
2791+#
2792+# This program is free software: you can redistribute it and/or modify
2793+# it under the terms of the GNU General Public License version 3, as
2794+# published by the Free Software Foundation.
2795+#
2796+# This program is distributed in the hope that it will be useful, but
2797+# WITHOUT ANY WARRANTY; without even the implied warranties of
2798+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2799+# PURPOSE. See the GNU General Public License for more details.
2800+#
2801+# You should have received a copy of the GNU General Public License
2802+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2803+import hooks
2804+if __name__ == '__main__':
2805+ hooks.bootstrap()
2806+ hooks.default_hook()
2807
2808=== modified file 'hooks/definitions.py'
2809--- hooks/definitions.py 2015-04-13 16:39:20 +0000
2810+++ hooks/definitions.py 2015-06-30 07:36:26 +0000
2811@@ -15,13 +15,11 @@
2812 # along with this program. If not, see <http://www.gnu.org/licenses/>.
2813
2814 from charmhelpers.core import hookenv
2815-from charmhelpers.core.hookenv import ERROR
2816 from charmhelpers.core import services
2817
2818 import actions
2819 import helpers
2820 import relations
2821-import rollingrestart
2822
2823
2824 def get_service_definitions():
2825@@ -35,32 +33,13 @@
2826 config = hookenv.config()
2827
2828 return [
2829- # Actions done before or while the Cassandra service is running.
2830- dict(service=helpers.get_cassandra_service(),
2831-
2832- # Open access to client and replication ports. Client
2833- # protocols require password authentication. Access to
2834- # the unauthenticated replication ports is protected via
2835- # ufw firewall rules. We do not open the JMX port, although
2836- # we could since it is similarly protected by ufw.
2837- ports=[config['rpc_port'], # Thrift clients
2838- config['native_transport_port'], # Native clients.
2839- config['storage_port'], # Plaintext replication
2840- config['ssl_storage_port']], # Encrypted replication.
2841-
2842- required_data=[relations.StorageRelation()],
2843- provided_data=[relations.StorageRelation()],
2844+ # Prepare for the Cassandra service.
2845+ dict(service='install',
2846 data_ready=[actions.set_proxy,
2847 actions.preinstall,
2848 actions.emit_meminfo,
2849 actions.revert_unchangeable_config,
2850 actions.store_unit_private_ip,
2851- actions.set_unit_zero_bootstrapped,
2852- actions.shutdown_before_joining_peers,
2853- # Must open ports before attempting bind to the
2854- # public ip address.
2855- actions.configure_firewall,
2856- actions.grant_ssh_access,
2857 actions.add_implicit_package_signing_keys,
2858 actions.configure_sources,
2859 actions.swapoff,
2860@@ -68,65 +47,75 @@
2861 actions.install_oracle_jre,
2862 actions.install_cassandra_packages,
2863 actions.emit_java_version,
2864- actions.ensure_cassandra_package_status,
2865+ actions.ensure_cassandra_package_status],
2866+ start=[], stop=[]),
2867+
2868+ # Get Cassandra running.
2869+ dict(service=helpers.get_cassandra_service(),
2870+
2871+ # Open access to client and replication ports. Client
2872+ # protocols require password authentication. Access to
2873+ # the unauthenticated replication ports is protected via
2874+ # ufw firewall rules. We do not open the JMX port, although
2875+ # we could since it is similarly protected by ufw.
2876+ ports=[config['rpc_port'], # Thrift clients
2877+ config['native_transport_port'], # Native clients.
2878+ config['storage_port'], # Plaintext replication
2879+ config['ssl_storage_port']], # Encrypted replication.
2880+
2881+ required_data=[relations.StorageRelation(),
2882+ relations.PeerRelation()],
2883+ provided_data=[relations.StorageRelation()],
2884+ data_ready=[actions.configure_firewall,
2885+ actions.maintain_seeds,
2886 actions.configure_cassandra_yaml,
2887 actions.configure_cassandra_env,
2888 actions.configure_cassandra_rackdc,
2889 actions.reset_all_io_schedulers,
2890- actions.nrpe_external_master_relation,
2891- actions.maybe_schedule_restart],
2892+ actions.maybe_restart,
2893+ actions.request_unit_superuser,
2894+ actions.reset_default_password],
2895 start=[services.open_ports],
2896 stop=[actions.stop_cassandra, services.close_ports]),
2897
2898- # Rolling restart. This service will call the restart hook when
2899- # it is this units turn to restart. This is also where we do
2900- # actions done while Cassandra is not running, and where we do
2901- # actions that should only be done by one node at a time.
2902- rollingrestart.make_service([helpers.stop_cassandra,
2903- helpers.remount_cassandra,
2904- helpers.ensure_database_directories,
2905- helpers.pre_bootstrap,
2906- helpers.start_cassandra,
2907- helpers.post_bootstrap,
2908- helpers.wait_for_agreed_schema,
2909- helpers.wait_for_normality,
2910- helpers.emit_describe_cluster,
2911- helpers.reset_default_password,
2912- helpers.ensure_unit_superuser,
2913- helpers.reset_auth_keyspace_replication]),
2914-
2915 # Actions that must be done while Cassandra is running.
2916 dict(service='post',
2917 required_data=[RequiresLiveNode()],
2918- data_ready=[actions.publish_database_relations,
2919+ data_ready=[actions.post_bootstrap,
2920+ actions.create_unit_superusers,
2921+ actions.reset_auth_keyspace_replication,
2922+ actions.publish_database_relations,
2923 actions.publish_database_admin_relations,
2924 actions.install_maintenance_crontab,
2925- actions.emit_describe_cluster,
2926- actions.emit_auth_keyspace_status,
2927- actions.emit_netstats],
2928+ actions.nrpe_external_master_relation,
2929+ actions.emit_cluster_info,
2930+ actions.set_active],
2931 start=[], stop=[])]
2932
2933
2934 class RequiresLiveNode:
2935 def __bool__(self):
2936- return self.is_live()
2937+ is_live = self.is_live()
2938+ hookenv.log('Requirement RequiresLiveNode: {}'.format(is_live),
2939+ hookenv.DEBUG)
2940+ return is_live
2941
2942 def is_live(self):
2943+ if helpers.is_decommissioned():
2944+ hookenv.log('Node is decommissioned')
2945+ return False
2946+
2947 if helpers.is_cassandra_running():
2948- if helpers.is_decommissioned():
2949- # Node is decommissioned and will refuse to talk.
2950- hookenv.log('Node is decommissioned')
2951- return False
2952- try:
2953- with helpers.connect():
2954- hookenv.log("Node live and authentication working")
2955- return True
2956- except Exception as x:
2957- hookenv.log(
2958- 'Unable to connect as superuser: {}'.format(str(x)),
2959- ERROR)
2960- return False
2961- return False
2962+ hookenv.log('Cassandra is running')
2963+ if hookenv.local_unit() in helpers.get_unit_superusers():
2964+ hookenv.log('Credentials created')
2965+ return True
2966+ else:
2967+ hookenv.log('Credentials have not been created')
2968+ return False
2969+ else:
2970+ hookenv.log('Cassandra is not running')
2971+ return False
2972
2973
2974 def get_service_manager():
2975
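For readers unfamiliar with the services framework used in the definitions above: each dict handed to charmhelpers' ServiceManager carries required_data guards that are evaluated as booleans, and the data_ready callables only run once every guard is truthy. That is how RequiresLiveNode gates the 'post' service. The following is a minimal sketch of the pattern, not code from this branch; RequiresSomething and do_work are invented names.

    from charmhelpers.core import hookenv
    from charmhelpers.core.services.base import ServiceManager

    class RequiresSomething:
        '''Stands in for a guard such as RequiresLiveNode.'''
        def __bool__(self):
            ready = True  # replace with a real readiness probe
            hookenv.log('Requirement RequiresSomething: {}'.format(ready))
            return ready

    def do_work(service_name):
        # data_ready callables are invoked with the service name.
        hookenv.log('{} is ready; doing work'.format(service_name))

    manager = ServiceManager([
        dict(service='example',
             required_data=[RequiresSomething()],
             data_ready=[do_work],
             start=[], stop=[]),
    ])
    manager.manage()

If any guard evaluates false, the framework skips that service's data_ready and start callbacks for the hook and fires its stop callbacks instead, so the 'post' actions above only run once the node is live and its credentials exist.
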
2976=== modified file 'hooks/helpers.py'
2977--- hooks/helpers.py 2015-04-17 10:06:34 +0000
2978+++ hooks/helpers.py 2015-06-30 07:36:26 +0000
2979@@ -13,14 +13,12 @@
2980 #
2981 # You should have received a copy of the GNU General Public License
2982 # along with this program. If not, see <http://www.gnu.org/licenses/>.
2983-
2984 import configparser
2985 from contextlib import contextmanager
2986 from datetime import timedelta
2987 import errno
2988 from functools import wraps
2989 import io
2990-from itertools import chain
2991 import json
2992 import os.path
2993 import re
2994@@ -37,13 +35,11 @@
2995 import cassandra.query
2996 import yaml
2997
2998-from charmhelpers.contrib import unison
2999 from charmhelpers.core import hookenv, host
3000 from charmhelpers.core.hookenv import DEBUG, ERROR, WARNING
3001 from charmhelpers import fetch
3002
3003-import relations
3004-import rollingrestart
3005+from coordinator import coordinator
3006
3007
3008 RESTART_TIMEOUT = 600
3009@@ -101,10 +97,9 @@
3010 if hookenv.config('extra_packages'):
3011 packages.extend(hookenv.config('extra_packages').split())
3012 packages = fetch.filter_installed_packages(packages)
3013- # if 'ntp' in packages:
3014- # fetch.apt_install(['ntp'], fatal=True) # With autostart
3015- # packages.remove('ntp')
3016 if packages:
3017+ # The DSE packages are huge, so this might take some time.
3018+ status_set('maintenance', 'Installing packages')
3019 with autostart_disabled(['cassandra']):
3020 fetch.apt_install(packages, fatal=True)
3021
3022@@ -128,26 +123,13 @@
3023 dpkg.communicate(input=''.join(selections).encode('US-ASCII'))
3024
3025
3026-def seed_ips():
3027+def get_seed_ips():
3028 '''Return the set of seed ip addresses.
3029
3030- If this is the only unit in the service, then the local unit's IP
3031- address is returned as the only seed.
3032-
3033- If there is more than one unit in the service, then the IP addresses
3034- of three lowest numbered peers (if bootstrapped) are returned.
3035+ We use ip addresses rather than unit names, as we may need to use
3036+ external seed ips at some point.
3037 '''
3038- peers = rollingrestart.get_peers()
3039- if not peers:
3040- return set([hookenv.unit_private_ip()])
3041-
3042- # The three lowest numbered units are seeds & may include this unit.
3043- seeds = sorted(list(peers) + [hookenv.local_unit()],
3044- key=lambda x: int(x.split('/')[-1]))[:3]
3045-
3046- relid = rollingrestart.get_peer_relation_id()
3047- return set(hookenv.relation_get('private-address', seed, relid)
3048- for seed in seeds)
3049+ return set((hookenv.leader_get('seeds') or '').split(','))
3050
3051
3052 def actual_seed_ips():
3053@@ -163,6 +145,7 @@
3054 Entries in the config file may be absolute, relative to
3055 /var/lib/cassandra, or relative to the mountpoint.
3056 '''
3057+ import relations
3058 storage = relations.StorageRelation()
3059 if storage.mountpoint:
3060 root = os.path.join(storage.mountpoint, 'cassandra')
3061@@ -177,9 +160,6 @@
3062
3063 Returns the absolute path.
3064 '''
3065- # Guard against changing perms on a running db. Although probably
3066- # harmless, it causes shutil.chown() to fail.
3067- assert not is_cassandra_running()
3068 absdir = get_database_directory(config_path)
3069
3070 # Work around Bug #1427150 by ensuring components of the path are
3071@@ -322,9 +302,7 @@
3072
3073 def get_cassandra_version():
3074 if get_cassandra_edition() == 'dse':
3075- # When we support multiple versions, we will need to map
3076- # DataStax versions to Cassandra versions.
3077- return '2.0' if get_package_version('dse-full') else None
3078+ return '2.1' if get_package_version('dse-full') else None
3079 return get_package_version('cassandra')
3080
3081
3082@@ -352,8 +330,6 @@
3083 edition = get_cassandra_edition()
3084 if edition == 'dse':
3085 pid_file = "/var/run/dse/dse.pid"
3086- # elif apt_pkg.version_compare(get_cassandra_version(), "2.0") < 0:
3087- # pid_file = "/var/run/cassandra.pid"
3088 else:
3089 pid_file = "/var/run/cassandra/cassandra.pid"
3090 return pid_file
3091@@ -377,20 +353,20 @@
3092 # agreement.
3093 pass
3094 else:
3095+ # NB. OpenJDK 8 not available in trusty.
3096 packages.add('openjdk-7-jre-headless')
3097
3098 return packages
3099
3100
3101 @logged
3102-def stop_cassandra(immediate=False):
3103+def stop_cassandra():
3104 if is_cassandra_running():
3105- if not immediate:
3106- # If there are cluster operations in progress, wait until
3107- # they are complete before restarting. This might take days.
3108- wait_for_normality()
3109+ hookenv.log('Shutting down Cassandra')
3110 host.service_stop(get_cassandra_service())
3111- assert not is_cassandra_running()
3112+ if is_cassandra_running():
3113+ hookenv.status_set('blocked', 'Cassandra failed to shut down')
3114+ raise SystemExit(0)
3115
3116
3117 @logged
3118@@ -398,9 +374,19 @@
3119 if is_cassandra_running():
3120 return
3121
3122- actual_seeds = actual_seed_ips()
3123+ actual_seeds = sorted(actual_seed_ips())
3124 assert actual_seeds, 'Attempting to start cassandra with empty seed list'
3125- hookenv.log('Starting Cassandra with seeds {!r}'.format(actual_seeds))
3126+ hookenv.config()['configured_seeds'] = actual_seeds
3127+
3128+ if is_bootstrapped():
3129+ status_set('maintenance',
3130+ 'Starting Cassandra with seeds {!r}'
3131+ .format(','.join(actual_seeds)))
3132+ else:
3133+ status_set('maintenance',
3134+ 'Bootstrapping with seeds {}'
3135+ .format(','.join(actual_seeds)))
3136+
3137 host.service_start(get_cassandra_service())
3138
3139 # Wait for Cassandra to actually start, or abort.
3140@@ -409,35 +395,8 @@
3141 if is_cassandra_running():
3142 return
3143 time.sleep(1)
3144- hookenv.log('Cassandra failed to start.', ERROR)
3145- raise SystemExit(1)
3146-
3147-
3148-def is_responding(ip, port=None, timeout=5):
3149- if port is None:
3150- port = hookenv.config()['storage_port']
3151- try:
3152- subprocess.check_output(['nc', '-zw', str(timeout), ip, str(port)],
3153- stderr=subprocess.STDOUT)
3154- hookenv.log('{} listening on port {}'.format(ip, port), DEBUG)
3155- return True
3156- except subprocess.TimeoutExpired:
3157- hookenv.log('{} not listening on port {}'.format(ip, port), DEBUG)
3158- return False
3159-
3160-
3161-@logged
3162-def are_all_nodes_responding():
3163- all_contactable = True
3164- for ip in node_ips():
3165- if ip == hookenv.unit_private_ip():
3166- pass
3167- elif is_responding(ip, timeout=5):
3168- hookenv.log('{} is responding'.format(ip))
3169- else:
3170- all_contactable = False
3171- hookenv.log('{} is not responding'.format(ip))
3172- return all_contactable
3173+ status_set('blocked', 'Cassandra failed to start')
3174+ raise SystemExit(0)
3175
3176
3177 @logged
3178@@ -451,8 +410,10 @@
3179 def remount_cassandra():
3180 '''If a new mountpoint is ready, migrate data across to it.'''
3181 assert not is_cassandra_running() # Guard against data loss.
3182+ import relations
3183 storage = relations.StorageRelation()
3184 if storage.needs_remount():
3185+ status_set('maintenance', 'Migrating data to new mountpoint')
3186 hookenv.config()['bootstrapped_into_cluster'] = False
3187 if storage.mountpoint is None:
3188 hookenv.log('External storage AND DATA gone. '
3189@@ -468,6 +429,9 @@
3190 @logged
3191 def ensure_database_directories():
3192 '''Ensure that directories Cassandra expects to store its data in exist.'''
3193+ # Guard against changing perms on a running db. Although probably
3194+ # harmless, it causes shutil.chown() to fail.
3195+ assert not is_cassandra_running()
3196 db_dirs = get_all_database_directories()
3197 unpacked_db_dirs = (db_dirs['data_file_directories'] +
3198 [db_dirs['commitlog_directory']] +
3199@@ -476,28 +440,7 @@
3200 ensure_database_directory(db_dir)
3201
3202
3203-@logged
3204-def reset_default_password():
3205- config = hookenv.config()
3206- if config.get('default_admin_password_changed', False):
3207- hookenv.log('Default admin password changed in an earlier hook')
3208- return
3209-
3210- # If we can connect using the default superuser password
3211- # 'cassandra', change it to something random. We do this in a loop
3212- # to ensure the hook that calls this is idempotent.
3213- try:
3214- with connect('cassandra', 'cassandra') as session:
3215- hookenv.log('Changing default admin password')
3216- query(session, 'ALTER USER cassandra WITH PASSWORD %s',
3217- ConsistencyLevel.QUORUM, (host.pwgen(),))
3218- except cassandra.AuthenticationFailed:
3219- hookenv.log('Default admin password already changed')
3220-
3221- config['default_admin_password_changed'] = True
3222-
3223-
3224-CONNECT_TIMEOUT = 300
3225+CONNECT_TIMEOUT = 10
3226
3227
3228 @contextmanager
3229@@ -568,72 +511,45 @@
3230 raise
3231
3232
3233-def ensure_user(username, password, superuser=False):
3234+def encrypt_password(password):
3235+ return bcrypt.hashpw(password, bcrypt.gensalt())
3236+
3237+
3238+@logged
3239+def ensure_user(session, username, encrypted_password, superuser=False):
3240 '''Create the DB user if it doesn't already exist & reset the password.'''
3241 if superuser:
3242 hookenv.log('Creating SUPERUSER {}'.format(username))
3243- sup = 'SUPERUSER'
3244 else:
3245 hookenv.log('Creating user {}'.format(username))
3246- sup = 'NOSUPERUSER'
3247- with connect() as session:
3248- query(session,
3249- 'CREATE USER IF NOT EXISTS %s '
3250- 'WITH PASSWORD %s {}'.format(sup),
3251- ConsistencyLevel.QUORUM, (username, password,))
3252- query(session, 'ALTER USER %s WITH PASSWORD %s {}'.format(sup),
3253- ConsistencyLevel.QUORUM, (username, password,))
3254-
3255-
3256-@logged
3257-def ensure_unit_superuser():
3258- '''If the unit's superuser account is not working, recreate it.'''
3259- try:
3260- with connect(auth_timeout=10):
3261- hookenv.log('Unit superuser account already setup', DEBUG)
3262- return
3263- except cassandra.AuthenticationFailed:
3264- pass
3265-
3266- create_unit_superuser() # Doesn't exist or can't access, so create it.
3267-
3268- with connect():
3269- hookenv.log('Unit superuser password reset successful')
3270-
3271-
3272-@logged
3273-def create_unit_superuser():
3274+ query(session,
3275+ 'INSERT INTO system_auth.users (name, super) VALUES (%s, %s)',
3276+ ConsistencyLevel.ALL, (username, superuser))
3277+ query(session,
3278+ 'INSERT INTO system_auth.credentials (username, salted_hash) '
3279+ 'VALUES (%s, %s)',
3280+ ConsistencyLevel.ALL, (username, encrypted_password))
3281+
3282+
3283+@logged
3284+def create_unit_superuser_hard():
3285 '''Create or recreate the unit's superuser account.
3286
3287- As there may be no known superuser credentials to use, we restart
3288- the node using the AllowAllAuthenticator and insert our user
3289- directly into the system_auth keyspace.
3290+ This method is used when there are no known superuser credentials
3291+ to use. We restart the node using the AllowAllAuthenticator and
3292+ insert our credentials directly into the system_auth keyspace.
3293 '''
3294 username, password = superuser_credentials()
3295+ pwhash = encrypt_password(password)
3296 hookenv.log('Creating unit superuser {}'.format(username))
3297
3298 # Restart cassandra without authentication & listening on localhost.
3299- wait_for_normality()
3300 reconfigure_and_restart_cassandra(
3301 dict(authenticator='AllowAllAuthenticator', rpc_address='localhost'))
3302- wait_for_normality()
3303- emit_cluster_info()
3304 for _ in backoff('superuser creation'):
3305 try:
3306 with connect() as session:
3307- pwhash = bcrypt.hashpw(password,
3308- bcrypt.gensalt()) # Cassandra 2.1
3309- statement = dedent('''\
3310- INSERT INTO system_auth.users (name, super)
3311- VALUES (%s, TRUE)
3312- ''')
3313- query(session, statement, ConsistencyLevel.QUORUM, (username,))
3314- statement = dedent('''\
3315- INSERT INTO system_auth.credentials (username, salted_hash)
3316- VALUES (%s, %s)
3317- ''')
3318- query(session, statement,
3319- ConsistencyLevel.QUORUM, (username, pwhash))
3320+ ensure_user(session, username, pwhash, superuser=True)
3321 break
3322 except Exception as x:
3323 print(str(x))
3324@@ -641,7 +557,6 @@
3325 # Restart Cassandra with regular config.
3326 nodetool('flush') # Ensure our backdoor updates are flushed.
3327 reconfigure_and_restart_cassandra()
3328- wait_for_normality()
3329
3330
3331 def get_cqlshrc_path():
3332@@ -703,12 +618,8 @@
3333 sys.stdout.flush()
3334
3335
3336-def nodetool(*cmd, ip=None, timeout=120):
3337+def nodetool(*cmd, timeout=120):
3338 cmd = ['nodetool'] + [str(i) for i in cmd]
3339- if ip is not None:
3340- assert ip in unison.collect_authed_hosts(
3341- rollingrestart.get_peer_relation_name())
3342- cmd = ['sudo', '-Hsu', 'juju_ssh', 'ssh', ip] + cmd
3343 i = 0
3344 until = time.time() + timeout
3345 for _ in backoff('nodetool to work'):
3346@@ -734,55 +645,16 @@
3347 except subprocess.CalledProcessError as x:
3348 if i > 1:
3349 emit(x.output.expandtabs()) # Expand tabs for juju debug-log.
3350+ if not is_cassandra_running():
3351+ status_set('blocked',
3352+ 'Cassandra has unexpectedly shutdown')
3353+ raise SystemExit(0)
3354 if time.time() >= until:
3355 raise
3356
3357
3358-def node_ips():
3359- '''IP addresses of all nodes in the Cassandra cluster.'''
3360- if is_bootstrapped() or num_peers() == 0:
3361- # If bootstrapped, or standalone, we trust our local info.
3362- raw = nodetool('status', 'system_auth')
3363- else:
3364- # If not bootstrapped, we query a seed.
3365- authed_ips = unison.collect_authed_hosts(
3366- rollingrestart.get_peer_relation_name())
3367- seeds = [ip for ip in seed_ips() if ip in authed_ips]
3368- if not seeds:
3369- return set() # Not bootstrapped, and nobody else is either.
3370- raw = nodetool('status', 'system_auth', ip=seeds[0])
3371-
3372- ips = set()
3373- for line in raw.splitlines():
3374- match = re.search(r'^(\w)([NLJM])\s+([\d\.]+)\s', line)
3375- if match is not None and (match.group(1) == 'U'
3376- or match.group(2) != 'L'):
3377- ips.add(match.group(3))
3378- return ips
3379-
3380-
3381-def up_node_ips():
3382- '''IP addresses of nodes that are up.'''
3383- raw = nodetool('status', 'system_auth')
3384- ips = set()
3385- for line in raw.splitlines():
3386- match = re.search(r'^(\w)([NLJM])\s+([\d\.]+)\s', line)
3387- if match is not None and match.group(1) == 'U': # Up
3388- ips.add(match.group(3))
3389- return ips
3390-
3391-
3392 def num_nodes():
3393- '''Number of nodes in the Cassandra cluster.
3394-
3395- This is not necessarily the same as the number of peers, as nodes
3396- may be decommissioned.
3397- '''
3398- return len(node_ips())
3399-
3400-
3401-def num_peers():
3402- return len(rollingrestart.get_peers())
3403+ return len(get_bootstrapped_ips())
3404
3405
3406 def read_cassandra_yaml():
3407@@ -819,7 +691,8 @@
3408 'storage_port', 'ssl_storage_port']
3409 cassandra_yaml.update((k, config[k]) for k in simple_config_keys)
3410
3411- seeds = ','.join(seeds or seed_ips()) # Don't include whitespace!
3412+ seeds = ','.join(seeds or get_seed_ips()) # Don't include whitespace!
3413+ hookenv.log('Configuring seeds as {!r}'.format(seeds), DEBUG)
3414 assert seeds, 'Attempting to configure cassandra with empty seed list'
3415 cassandra_yaml['seed_provider'][0]['parameters'][0]['seeds'] = seeds
3416
3417@@ -892,36 +765,6 @@
3418 return False
3419
3420
3421-@logged
3422-def reset_all_io_schedulers():
3423- dirs = get_all_database_directories()
3424- dirs = (dirs['data_file_directories'] + [dirs['commitlog_directory']] +
3425- [dirs['saved_caches_directory']])
3426- config = hookenv.config()
3427- for d in dirs:
3428- if os.path.isdir(d): # Directory may not exist yet.
3429- set_io_scheduler(config['io_scheduler'], d)
3430-
3431-
3432-@logged
3433-def reset_auth_keyspace_replication():
3434- # Cassandra requires you to manually set the replication factor of
3435- # the system_auth keyspace, to ensure availability and redundancy.
3436- n = max(min(num_nodes(), 3), 1) # Cap at 1-3 replicas.
3437- datacenter = hookenv.config()['datacenter']
3438- with connect() as session:
3439- strategy_opts = get_auth_keyspace_replication(session)
3440- rf = int(strategy_opts.get(datacenter, -1))
3441- hookenv.log('system_auth rf={!r}'.format(strategy_opts))
3442- if rf != n:
3443- strategy_opts['class'] = 'NetworkTopologyStrategy'
3444- strategy_opts[datacenter] = n
3445- if 'replication_factor' in strategy_opts:
3446- del strategy_opts['replication_factor']
3447- set_auth_keyspace_replication(session, strategy_opts)
3448- repair_auth_keyspace()
3449-
3450-
3451 def get_auth_keyspace_replication(session):
3452 statement = dedent('''\
3453 SELECT strategy_options FROM system.schema_keyspaces
3454@@ -931,321 +774,96 @@
3455 return json.loads(r[0][0])
3456
3457
3458+@logged
3459 def set_auth_keyspace_replication(session, settings):
3460- wait_for_normality()
3461- hookenv.log('Updating system_auth rf={!r}'.format(settings))
3462+ # Live operation, so keep status the same.
3463+ status_set(hookenv.status_get(),
3464+ 'Updating system_auth rf to {!r}'.format(settings))
3465 statement = 'ALTER KEYSPACE system_auth WITH REPLICATION = %s'
3466- query(session, statement, ConsistencyLevel.QUORUM, (settings,))
3467+ query(session, statement, ConsistencyLevel.ALL, (settings,))
3468
3469
3470 @logged
3471 def repair_auth_keyspace():
3472- # First, wait for schema agreement. Attempting to repair a keyspace
3473- # with inconsistent replication settings will fail.
3474- wait_for_agreed_schema()
3475 # Repair takes a long time, and may need to be retried due to 'snapshot
3476 # creation' errors, but should certainly complete within an hour since
3477 # the keyspace is tiny.
3478+ status_set(hookenv.status_get(),
3479+ 'Repairing system_auth keyspace')
3480 nodetool('repair', 'system_auth', timeout=3600)
3481
3482
3483-def non_system_keyspaces():
3484- # If there are only system keyspaces defined, there is no data we
3485- # may want to preserve and we can safely proceed with the bootstrap.
3486- # This should always be the case, but it is worth checking for weird
3487- # situations such as reusing an existing external mount without
3488- # clearing its data.
3489- dfds = get_all_database_directories()['data_file_directories']
3490- keyspaces = set(chain(*[os.listdir(dfd) for dfd in dfds]))
3491- hookenv.log('keyspaces={!r}'.format(keyspaces), DEBUG)
3492- return keyspaces - set(['system', 'system_auth', 'system_traces',
3493- 'dse_system'])
3494-
3495-
3496-def nuke_local_database():
3497- '''Destroy the local database, entirely, so this node may bootstrap.
3498-
3499- This needs to be lower level than just removing selected keyspaces
3500- such as 'system', as commitlogs and other crumbs can also cause the
3501- bootstrap process to fail disasterously.
3502- '''
3503- # This function is dangerous enough to warrent a guard.
3504- assert not is_bootstrapped()
3505- assert unit_number() != 0
3506-
3507- keyspaces = non_system_keyspaces()
3508- if keyspaces:
3509- hookenv.log('Non-system keyspaces {!r} detected. '
3510- 'Unable to bootstrap.'.format(keyspaces), ERROR)
3511- raise SystemExit(1)
3512-
3513- dirs = get_all_database_directories()
3514- nuke_directory_contents(dirs['saved_caches_directory'])
3515- nuke_directory_contents(dirs['commitlog_directory'])
3516- for dfd in dirs['data_file_directories']:
3517- nuke_directory_contents(dfd)
3518-
3519-
3520-def nuke_directory_contents(d):
3521- '''Remove the contents of directory d, leaving d in place.
3522-
3523- We don't remove the top level directory as it may be a mount
3524- or symlink we cannot recreate.
3525- '''
3526- for name in os.listdir(d):
3527- path = os.path.join(d, name)
3528- if os.path.isdir(path):
3529- shutil.rmtree(path)
3530- else:
3531- os.remove(path)
3532-
3533-
3534-def unit_number(unit=None):
3535- if unit is None:
3536- unit = hookenv.local_unit()
3537- return int(unit.split('/')[-1])
3538-
3539-
3540 def is_bootstrapped(unit=None):
3541 '''Return True if the node has already bootstrapped into the cluster.'''
3542- if unit is None:
3543- unit = hookenv.local_unit()
3544- if unit.endswith('/0'):
3545- return True
3546- relid = rollingrestart.get_peer_relation_id()
3547- if not relid:
3548+ if unit is None or unit == hookenv.local_unit():
3549+ return hookenv.config().get('bootstrapped', False)
3550+ elif coordinator.relid:
3551+ return bool(hookenv.relation_get(rid=coordinator.relid,
3552+ unit=unit).get('bootstrapped'))
3553+ else:
3554 return False
3555- return hookenv.relation_get('bootstrapped', unit, relid) == '1'
3556-
3557-
3558-def set_bootstrapped(flag):
3559- relid = rollingrestart.get_peer_relation_id()
3560- if bool(flag) is not is_bootstrapped():
3561- hookenv.log('Setting bootstrapped to {}'.format(flag))
3562-
3563- if flag:
3564- hookenv.relation_set(relid, bootstrapped="1")
3565- else:
3566- hookenv.relation_set(relid, bootstrapped=None)
3567-
3568-
3569-def bootstrapped_peers():
3570- '''Return the ordered list of bootstrapped peers'''
3571- peer_relid = rollingrestart.get_peer_relation_id()
3572- return [peer for peer in rollingrestart.get_peers()
3573- if hookenv.relation_get('bootstrapped', peer, peer_relid) == '1']
3574-
3575-
3576-def unbootstrapped_peers():
3577- '''Return the ordered list of unbootstrapped peers'''
3578- peer_relid = rollingrestart.get_peer_relation_id()
3579- return [peer for peer in rollingrestart.get_peers()
3580- if hookenv.relation_get('bootstrapped', peer, peer_relid) != '1']
3581-
3582-
3583-@logged
3584-def pre_bootstrap():
3585- """If we are about to bootstrap a node, prepare.
3586-
3587- Only the first node in the cluster is not bootstrapped. All other
3588- nodes added to the cluster need to bootstrap. To bootstrap, the local
3589- node needs to be shutdown, the database needs to be completely reset,
3590- and the node restarted with a valid seed_node.
3591-
3592- Until juju gains the necessary features, the best we can do is assume
3593- that Unit 0 should be the initial seed node. If the service has been
3594- running for some time and never had a second unit added, it is almost
3595- certain that it is Unit 0 that contains data we want to keep. This
3596- assumption is false if you have removed the only unit in the service
3597- at some point, so don't do that.
3598- """
3599- if is_bootstrapped():
3600- hookenv.log("Already bootstrapped")
3601- return
3602-
3603- if num_peers() == 0:
3604- hookenv.log("No peers, no cluster, no bootstrapping")
3605- return
3606-
3607- if not seed_ips():
3608- hookenv.log("No seeds available. Deferring bootstrap.")
3609- raise rollingrestart.DeferRestart()
3610-
3611- # Don't attempt to bootstrap until all lower numbered units have
3612- # bootstrapped. We need to do this as the rollingrestart algorithm
3613- # fails during initial cluster setup, where two or more newly joined
3614- # units may restart (and bootstrap) at the same time and exceed
3615- # Cassandra's limits on the number of nodes that may bootstrap
3616- # simultaneously. In addition, this ensures we wait until there is
3617- # at least one seed bootstrapped, as the three first units will be
3618- # seeds.
3619- for peer in unbootstrapped_peers():
3620- if unit_number(peer) < unit_number():
3621- hookenv.log("{} is not bootstrapped. Deferring.".format(peer))
3622- raise rollingrestart.DeferRestart()
3623-
3624- # Don't attempt to bootstrap until all bootstrapped peers have
3625- # authorized us.
3626- peer_relname = rollingrestart.get_peer_relation_name()
3627- peer_relid = rollingrestart.get_peer_relation_id()
3628- authed_ips = unison.collect_authed_hosts(peer_relname)
3629- for peer in bootstrapped_peers():
3630- peer_ip = hookenv.relation_get('private-address', peer, peer_relid)
3631- if peer_ip not in authed_ips:
3632- hookenv.log("{} has not authorized us. Deferring.".format(peer))
3633- raise rollingrestart.DeferRestart()
3634-
3635- # Bootstrap fail if we haven't yet opened our ports to all
3636- # the bootstrapped nodes. This is the case if we have not yet
3637- # joined the peer relationship with the node's unit.
3638- missing = node_ips() - peer_ips() - set([hookenv.unit_private_ip()])
3639- if missing:
3640- hookenv.log("Not yet in a peer relationship with {!r}. "
3641- "Deferring bootstrap".format(missing))
3642- raise rollingrestart.DeferRestart()
3643-
3644- # Bootstrap will fail if all the nodes that contain data that needs
3645- # to move to the new node are down. If you are storing data in
3646- # rf==1, bootstrap can fail if a single node is not contactable. As
3647- # far as this charm is concerned, we should not attempt bootstrap
3648- # until all the nodes in the cluster and contactable. This also
3649- # ensures that peers have run their relation-changed hook and
3650- # opened their firewall ports to the new unit.
3651- if not are_all_nodes_responding():
3652- raise rollingrestart.DeferRestart()
3653-
3654- hookenv.log('Joining cluster and need to bootstrap.')
3655-
3656- nuke_local_database()
3657-
3658-
3659-@logged
3660-def post_bootstrap():
3661- '''Maintain state on if the node has bootstrapped into the cluster.
3662-
3663- Per documented procedure for adding new units to a cluster, wait 2
3664- minutes if the unit has just bootstrapped to ensure no other.
3665+
3666+
3667+def set_bootstrapped():
3668+ # We need to store this flag in two locations. The peer relation,
3669+ # so peers can see it, and local state, for when we haven't joined
3670+ # the peer relation yet. actions.publish_bootstrapped_flag()
3671+ # calls this method again when necessary to ensure that state is
3672+ # propagated if/when the peer relation is joined.
3673+ config = hookenv.config()
3674+ config['bootstrapped'] = True
3675+ if coordinator.relid is not None:
3676+ hookenv.relation_set(coordinator.relid, bootstrapped="1")
3677+ if config.changed('bootstrapped'):
3678+ hookenv.log('Bootstrapped')
3679+ else:
3680+ hookenv.log('Already bootstrapped')
3681+
3682+
3683+def get_bootstrapped():
3684+ units = [hookenv.local_unit()]
3685+ if coordinator.relid is not None:
3686+ units.extend(hookenv.related_units(coordinator.relid))
3687+ return set([unit for unit in units if is_bootstrapped(unit)])
3688+
3689+
3690+def get_bootstrapped_ips():
3691+ return set([unit_to_ip(unit) for unit in get_bootstrapped()])
3692+
3693+
3694+def unit_to_ip(unit):
3695+ if unit is None or unit == hookenv.local_unit():
3696+ return hookenv.unit_private_ip()
3697+ elif coordinator.relid:
3698+ return hookenv.relation_get(rid=coordinator.relid,
3699+ unit=unit).get('private-address')
3700+ else:
3701+ return None
3702+
3703+
3704+def get_node_status():
3705+ '''Return the Cassandra node status.
3706+
3707+ May be NORMAL, JOINING, DECOMMISSIONED etc., or None if we can't tell.
3708 '''
3709- if num_peers() == 0:
3710- # There is no cluster (just us), so we are not bootstrapped into
3711- # the cluster.
3712- return
3713-
3714- if not is_bootstrapped():
3715- for _ in backoff('bootstrap to finish'):
3716- if not is_cassandra_running():
3717- hookenv.log('Bootstrap failed. Retrying')
3718- start_cassandra()
3719- break
3720- if is_all_normal():
3721- hookenv.log('Cluster settled after bootstrap.')
3722- config = hookenv.config()
3723- hookenv.log('Waiting {}s.'.format(
3724- config['post_bootstrap_delay']))
3725- time.sleep(config['post_bootstrap_delay'])
3726- if is_cassandra_running():
3727- set_bootstrapped(True)
3728- break
3729-
3730-
3731-def is_schema_agreed():
3732- '''Return True if all the nodes that are up agree on a schema.'''
3733- up_ips = set(up_node_ips())
3734- # Always include ourself since we may be joining just now.
3735- up_ips.add(hookenv.unit_private_ip())
3736- raw = nodetool('describecluster')
3737- # The output of nodetool describe cluster is almost yaml,
3738- # so we use that tool once we fix the tabs.
3739- description = yaml.load(raw.expandtabs())
3740- versions = description['Cluster Information']['Schema versions'] or {}
3741-
3742- for schema, schema_ips in versions.items():
3743- schema_ips = set(schema_ips)
3744- if up_ips.issubset(schema_ips):
3745- hookenv.log('{!r} agree on schema'.format(up_ips), DEBUG)
3746- return True
3747- hookenv.log('{!r} do not agree on schema'.format(up_ips), DEBUG)
3748+ if not is_cassandra_running():
3749+ return None
3750+ raw = nodetool('netstats')
3751+ m = re.search(r'(?m)^Mode:\s+(\w+)$', raw)
3752+ if m is None:
3753+ return None
3754+ return m.group(1).upper()
3755+
3756+
3757+def is_decommissioned():
3758+ status = get_node_status()
3759+ if status in ('DECOMMISSIONED', 'LEAVING'):
3760+ hookenv.log('This node is {}'.format(status), WARNING)
3761+ return True
3762 return False
3763
3764
3765 @logged
3766-def wait_for_agreed_schema():
3767- for _ in backoff('schema agreement'):
3768- if is_schema_agreed():
3769- return
3770-
3771-
3772-def peer_ips():
3773- ips = set()
3774- relid = rollingrestart.get_peer_relation_id()
3775- if relid is not None:
3776- for unit in hookenv.related_units(relid):
3777- ip = hookenv.relation_get('private-address', unit, relid)
3778- ips.add(ip)
3779- return ips
3780-
3781-
3782-def is_all_normal(timeout=120):
3783- '''All nodes in the cluster report status Normal.
3784-
3785- Returns false if a node is joining, leaving or moving in the ring.
3786- '''
3787- is_all_normal = True
3788- try:
3789- raw = nodetool('status', 'system_auth', timeout=timeout)
3790- except subprocess.TimeoutExpired:
3791- return False
3792- node_status_re = re.compile('^(.)([NLJM])\s+([\d\.]+)\s')
3793- for line in raw.splitlines():
3794- match = node_status_re.search(line)
3795- if match is not None:
3796- updown, mode, address = match.groups()
3797- if updown == 'D':
3798- # 'Down' is unfortunately just informative. During service
3799- # teardown, nodes will disappear without decommissioning
3800- # leaving these entries.
3801- if address == hookenv.unit_private_ip():
3802- is_all_normal = False
3803- hookenv.log('Node {} (this node) is down'.format(address))
3804- else:
3805- hookenv.log('Node {} is down'.format(address))
3806- elif updown == '?': # CASSANDRA-8791
3807- is_all_normal = False
3808- hookenv.log('Node {} is transitioning'.format(address))
3809-
3810- if mode == 'L':
3811- hookenv.log('Node {} is leaving the cluster'.format(address))
3812- is_all_normal = False
3813- elif mode == 'J':
3814- hookenv.log('Node {} is joining the cluster'.format(address))
3815- is_all_normal = False
3816- elif mode == 'M':
3817- hookenv.log('Node {} is moving ring position'.format(address))
3818- is_all_normal = False
3819- return is_all_normal
3820-
3821-
3822-@logged
3823-def wait_for_normality():
3824- for _ in backoff('cluster operators to complete'):
3825- if is_all_normal():
3826- return
3827-
3828-
3829-def is_decommissioned():
3830- if not is_cassandra_running():
3831- return False # Decommissioned nodes are not shut down.
3832-
3833- for _ in backoff('stable node mode'):
3834- raw = nodetool('netstats')
3835- if 'Mode: DECOMMISSIONED' in raw:
3836- hookenv.log('This node is DECOMMISSIONED', WARNING)
3837- return True
3838- elif 'Mode: NORMAL' in raw:
3839- return False
3840-
3841-
3842-@logged
3843 def emit_describe_cluster():
3844 '''Run nodetool describecluster for the logs.'''
3845 nodetool('describecluster') # Implicit emit
3846@@ -1309,3 +927,62 @@
3847 # FOR CHARMHELPERS. This should be a constant in nrpe.py
3848 def local_plugins_dir():
3849 return '/usr/local/lib/nagios/plugins'
3850+
3851+
3852+def leader_ping():
3853+ '''Make a change in the leader settings, waking the non-leaders.'''
3854+ assert hookenv.is_leader()
3855+ last = int(hookenv.leader_get('ping') or 0)
3856+ hookenv.leader_set(ping=str(last + 1))
3857+
3858+
3859+def get_unit_superusers():
3860+ '''Return the set of units that have had their superuser accounts created.
3861+ '''
3862+ raw = hookenv.leader_get('superusers')
3863+ return set(json.loads(raw or '[]'))
3864+
3865+
3866+def set_unit_superusers(superusers):
3867+ hookenv.leader_set(superusers=json.dumps(sorted(superusers)))
3868+
3869+
3870+def status_set(state, message):
3871+ '''Set the unit status and log a message.'''
3872+ hookenv.status_set(state, message)
3873+ hookenv.log('{} unit state: {}'.format(state, message))
3874+
3875+
3876+def service_status_set(state, message):
3877+ '''Set the service status and log a message.'''
3878+ subprocess.check_call(['status-set', '--service', state, message])
3879+ hookenv.log('{} service state: {}'.format(state, message))
3880+
3881+
3882+def get_service_name(relid):
3883+ '''Return the service name for the other end of relid.'''
3884+ units = hookenv.related_units(relid)
3885+ if units:
3886+ return units[0].split('/', 1)[0]
3887+ else:
3888+ return None
3889+
3890+
3891+def peer_relid():
3892+ return coordinator.relid
3893+
3894+
3895+@logged
3896+def set_active():
3897+ '''Set happy state'''
3898+ if hookenv.unit_private_ip() in get_seed_ips():
3899+ msg = 'Live seed'
3900+ else:
3901+ msg = 'Live node'
3902+ status_set('active', msg)
3903+
3904+ if hookenv.is_leader():
3905+ n = num_nodes()
3906+ if n == 1:
3907+ n = 'Single'
3908+ service_status_set('active', '{} node cluster'.format(n))
3909
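An aside on the new leader-settings helpers above (leader_ping, get_unit_superusers, set_unit_superusers): only the leader may call hookenv.leader_set(), and bumping the 'ping' counter guarantees every non-leader receives a leader-settings-changed hook and re-reads the shared state. A rough sketch of how the pieces combine; example_mark_unit_ready is a hypothetical helper, not something defined in this branch.

    import json

    from charmhelpers.core import hookenv

    def example_mark_unit_ready(unit):
        '''Hypothetical: record that a unit's superuser account works.'''
        if not hookenv.is_leader():
            return  # only the leader may write leader settings
        # Merge the unit into the shared JSON list, as set_unit_superusers does.
        current = set(json.loads(hookenv.leader_get('superusers') or '[]'))
        current.add(unit)
        hookenv.leader_set(superusers=json.dumps(sorted(current)))
        # Wake the non-leaders, mirroring leader_ping() above.
        ping = int(hookenv.leader_get('ping') or 0)
        hookenv.leader_set(ping=str(ping + 1))

Non-leaders then read the same keys back with hookenv.leader_get(), which is exactly what get_unit_superusers() and get_seed_ips() do.
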
3910=== modified file 'hooks/hooks.py' (properties changed: +x to -x)
3911--- hooks/hooks.py 2015-04-28 16:20:05 +0000
3912+++ hooks/hooks.py 2015-06-30 07:36:26 +0000
3913@@ -1,5 +1,19 @@
3914 #!/usr/bin/python3
3915-
3916+# Copyright 2015 Canonical Ltd.
3917+#
3918+# This file is part of the Cassandra Charm for Juju.
3919+#
3920+# This program is free software: you can redistribute it and/or modify
3921+# it under the terms of the GNU General Public License version 3, as
3922+# published by the Free Software Foundation.
3923+#
3924+# This program is distributed in the hope that it will be useful, but
3925+# WITHOUT ANY WARRANTY; without even the implied warranties of
3926+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
3927+# PURPOSE. See the GNU General Public License for more details.
3928+#
3929+# You should have received a copy of the GNU General Public License
3930+# along with this program. If not, see <http://www.gnu.org/licenses/>.
3931 from charmhelpers import fetch
3932 from charmhelpers.core import hookenv
3933
3934@@ -18,20 +32,21 @@
3935
3936
3937 def default_hook():
3938+ if not hookenv.has_juju_version('1.24'):
3939+ hookenv.status_set('blocked', 'Requires Juju 1.24 or higher')
3940+ # Error state, since we don't have 1.24 to give a nice blocked state.
3941+ raise SystemExit(1)
3942+
3943 # These need to be imported after bootstrap() or required Python
3944 # packages may not have been installed.
3945 import definitions
3946- from loglog import loglog
3947
3948 # Only useful for debugging, or perhaps have this enabled with a config
3949 # option?
3950- loglog('/var/log/cassandra/system.log', prefix='C*: ')
3951+ ## from loglog import loglog
3952+ ## loglog('/var/log/cassandra/system.log', prefix='C*: ')
3953
3954 hookenv.log('*** {} Hook Start'.format(hookenv.hook_name()))
3955 sm = definitions.get_service_manager()
3956 sm.manage()
3957 hookenv.log('*** {} Hook Done'.format(hookenv.hook_name()))
3958-
3959-if __name__ == '__main__':
3960- bootstrap()
3961- default_hook()
3962
3963=== modified symlink 'hooks/install' (properties changed: -x to +x)
3964=== target was u'hooks.py'
3965--- hooks/install 1970-01-01 00:00:00 +0000
3966+++ hooks/install 2015-06-30 07:36:26 +0000
3967@@ -0,0 +1,20 @@
3968+#!/usr/bin/python3
3969+# Copyright 2015 Canonical Ltd.
3970+#
3971+# This file is part of the Cassandra Charm for Juju.
3972+#
3973+# This program is free software: you can redistribute it and/or modify
3974+# it under the terms of the GNU General Public License version 3, as
3975+# published by the Free Software Foundation.
3976+#
3977+# This program is distributed in the hope that it will be useful, but
3978+# WITHOUT ANY WARRANTY; without even the implied warranties of
3979+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
3980+# PURPOSE. See the GNU General Public License for more details.
3981+#
3982+# You should have received a copy of the GNU General Public License
3983+# along with this program. If not, see <http://www.gnu.org/licenses/>.
3984+import hooks
3985+if __name__ == '__main__':
3986+ hooks.bootstrap()
3987+ hooks.default_hook()
3988
3989=== added file 'hooks/leader-elected'
3990--- hooks/leader-elected 1970-01-01 00:00:00 +0000
3991+++ hooks/leader-elected 2015-06-30 07:36:26 +0000
3992@@ -0,0 +1,20 @@
3993+#!/usr/bin/python3
3994+# Copyright 2015 Canonical Ltd.
3995+#
3996+# This file is part of the Cassandra Charm for Juju.
3997+#
3998+# This program is free software: you can redistribute it and/or modify
3999+# it under the terms of the GNU General Public License version 3, as
4000+# published by the Free Software Foundation.
4001+#
4002+# This program is distributed in the hope that it will be useful, but
4003+# WITHOUT ANY WARRANTY; without even the implied warranties of
4004+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
4005+# PURPOSE. See the GNU General Public License for more details.
4006+#
4007+# You should have received a copy of the GNU General Public License
4008+# along with this program. If not, see <http://www.gnu.org/licenses/>.
4009+import hooks
4010+if __name__ == '__main__':
4011+ hooks.bootstrap()
4012+ hooks.default_hook()
4013
4014=== added file 'hooks/leader-settings-changed'
4015--- hooks/leader-settings-changed 1970-01-01 00:00:00 +0000
4016+++ hooks/leader-settings-changed 2015-06-30 07:36:26 +0000
4017@@ -0,0 +1,20 @@
4018+#!/usr/bin/python3
4019+# Copyright 2015 Canonical Ltd.
4020+#
4021+# This file is part of the Cassandra Charm for Juju.
4022+#
4023+# This program is free software: you can redistribute it and/or modify
4024+# it under the terms of the GNU General Public License version 3, as
4025+# published by the Free Software Foundation.
4026+#
4027+# This program is distributed in the hope that it will be useful, but
4028+# WITHOUT ANY WARRANTY; without even the implied warranties of
4029+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
4030+# PURPOSE. See the GNU General Public License for more details.
4031+#
4032+# You should have received a copy of the GNU General Public License
4033+# along with this program. If not, see <http://www.gnu.org/licenses/>.
4034+import hooks
4035+if __name__ == '__main__':
4036+ hooks.bootstrap()
4037+ hooks.default_hook()
4038
4039=== modified symlink 'hooks/nrpe-external-master-relation-changed' (properties changed: -x to +x)
4040=== target was u'hooks.py'
4041--- hooks/nrpe-external-master-relation-changed 1970-01-01 00:00:00 +0000
4042+++ hooks/nrpe-external-master-relation-changed 2015-06-30 07:36:26 +0000
4043@@ -0,0 +1,20 @@
4044+#!/usr/bin/python3
4045+# Copyright 2015 Canonical Ltd.
4046+#
4047+# This file is part of the Cassandra Charm for Juju.
4048+#
4049+# This program is free software: you can redistribute it and/or modify
4050+# it under the terms of the GNU General Public License version 3, as
4051+# published by the Free Software Foundation.
4052+#
4053+# This program is distributed in the hope that it will be useful, but
4054+# WITHOUT ANY WARRANTY; without even the implied warranties of
4055+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
4056+# PURPOSE. See the GNU General Public License for more details.
4057+#
4058+# You should have received a copy of the GNU General Public License
4059+# along with this program. If not, see <http://www.gnu.org/licenses/>.
4060+import hooks
4061+if __name__ == '__main__':
4062+ hooks.bootstrap()
4063+ hooks.default_hook()
4064
4065=== modified file 'hooks/relations.py'
4066--- hooks/relations.py 2015-02-02 15:52:50 +0000
4067+++ hooks/relations.py 2015-06-30 07:36:26 +0000
4068@@ -22,8 +22,22 @@
4069 from charmhelpers.core.hookenv import log, WARNING
4070 from charmhelpers.core.services.helpers import RelationContext
4071
4072-
4073-# FOR CHARMHELPERS
4074+from coordinator import coordinator
4075+
4076+
4077+class PeerRelation(RelationContext):
4078+ interface = 'cassandra-cluster'
4079+ name = 'cluster'
4080+
4081+ def is_ready(self):
4082+ # All units except the leader need to wait until the peer
4083+ # relation is available.
4084+ if coordinator.relid is not None or hookenv.is_leader():
4085+ return True
4086+ return False
4087+
4088+
4089+# FOR CHARMHELPERS (if we can integrate Juju 1.24 storage too)
4090 class StorageRelation(RelationContext):
4091 '''Wait for the block storage mount to become available.
4092
4093@@ -80,7 +94,9 @@
4094 return False
4095 return True
4096
4097- def provide_data(self):
4098+ def provide_data(self, remote_service, service_ready):
4099+ hookenv.log('Requesting mountpoint {} from {}'
4100+ .format(self._requested_mountpoint, remote_service))
4101 return dict(mountpoint=self._requested_mountpoint)
4102
4103 def needs_remount(self):
4104
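The provide_data() signature above matches the reworked services framework in this branch, which hands each provided_data item the remote service name and a readiness flag instead of calling it with no arguments. A sketch of a provider written against that signature; the interface and relation names below are invented for illustration and the returned settings are an example only.

    from charmhelpers.core import hookenv
    from charmhelpers.core.services.helpers import RelationContext

    class ExampleProvider(RelationContext):
        interface = 'block-storage'  # hypothetical interface name
        name = 'data'                # hypothetical relation name

        def provide_data(self, remote_service, service_ready):
            # Called once per related service; the returned dict is set
            # on the relation for the remote side to consume.
            hookenv.log('Publishing mountpoint for {} (ready={})'.format(
                remote_service, service_ready))
            return dict(mountpoint='/srv/example')
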
4105=== removed file 'hooks/rollingrestart.py'
4106--- hooks/rollingrestart.py 2015-02-14 04:58:55 +0000
4107+++ hooks/rollingrestart.py 1970-01-01 00:00:00 +0000
4108@@ -1,257 +0,0 @@
4109-# Copyright 2015 Canonical Ltd.
4110-#
4111-# This file is part of the Cassandra Charm for Juju.
4112-#
4113-# This program is free software: you can redistribute it and/or modify
4114-# it under the terms of the GNU General Public License version 3, as
4115-# published by the Free Software Foundation.
4116-#
4117-# This program is distributed in the hope that it will be useful, but
4118-# WITHOUT ANY WARRANTY; without even the implied warranties of
4119-# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
4120-# PURPOSE. See the GNU General Public License for more details.
4121-#
4122-# You should have received a copy of the GNU General Public License
4123-# along with this program. If not, see <http://www.gnu.org/licenses/>.
4124-
4125-from datetime import datetime
4126-import os.path
4127-
4128-from charmhelpers.contrib import peerstorage
4129-from charmhelpers.core import hookenv, host
4130-from charmhelpers.core.hookenv import DEBUG
4131-
4132-
4133-def request_restart():
4134- '''Request a rolling restart.'''
4135- # We store the local restart request as a flag on the filesystem,
4136- # for rolling_restart() to deal with later. This allows
4137- # rolling_restart to know that this is a new entry to be queued for
4138- # a future hook to handle, or an existing entry in the queue that
4139- # may be restarted this hook.
4140- flag = os.path.join(hookenv.charm_dir(), '.needs-restart')
4141- if os.path.exists(flag):
4142- return
4143- host.write_file(flag, utcnow_str().encode('US-ASCII'))
4144-
4145-
4146-def is_waiting_for_restart():
4147- '''Is there an outstanding rolling restart request.'''
4148- flag = os.path.join(hookenv.charm_dir(), '.needs-restart')
4149- return os.path.exists(flag)
4150-
4151-
4152-def cancel_restart():
4153- '''Cancel the rolling restart request, if any, and dequeue.'''
4154- _enqueue(False)
4155- flag = os.path.join(hookenv.charm_dir(), '.needs-restart')
4156- if os.path.exists(flag):
4157- os.remove(flag)
4158-
4159-
4160-class DeferRestart(Exception):
4161- '''Defer the restart sequence until a future hook.
4162-
4163- If a restart callback raises this exception, no further callbacks
4164- are made and the restart request is moved to the end of the queue.
4165- The restart will be triggered again in a future hook.
4166- '''
4167-
4168-
4169-def get_restart_queue():
4170- '''The sorted list of units waiting for a rolling restart.
4171-
4172- The local unit will not be included until after rolling_restart
4173- has been called.
4174- '''
4175- relid = get_peer_relation_id()
4176- if relid is None:
4177- return []
4178-
4179- all_units = set(get_peers())
4180- all_units.add(hookenv.local_unit())
4181-
4182- queue = []
4183-
4184- # Iterate over all units, retrieving restartrequest flags.
4185- # We can't use the peerstorage helpers here, as that will only
4186- # retrieve data that has been echoed and may be out of date if
4187- # the necessary relation-changed hooks have not yet been invoked.
4188- for unit in all_units:
4189- relinfo = hookenv.relation_get(unit=unit, rid=relid)
4190- request = relinfo.get('rollingrestart_{}'.format(unit))
4191- if request is not None:
4192- queue.append((request, unit))
4193- queue.sort()
4194- return [unit for _, unit in queue]
4195-
4196-
4197-def _enqueue(flag):
4198- '''Add or remove an entry from peerstorage queue.'''
4199- relid = get_peer_relation_id()
4200- if not relid:
4201- # No peer relation, no problems. If there is no peer relation,
4202- # there are no peers to coordinate with.
4203- return
4204-
4205- unit = hookenv.local_unit()
4206- queue = get_restart_queue()
4207- if flag and unit in queue:
4208- return # Already queued.
4209-
4210- key = _peerstorage_key()
4211- value = utcnow_str() if flag else None
4212- hookenv.log('Restart queue {} = {}'.format(key, value))
4213- hookenv.relation_set(relid, {key: value})
4214-
4215-
4216-def rolling_restart(restart_hooks, prerequisites=()):
4217- '''To ensure availability, only restart one unit at a time.
4218-
4219- Returns:
4220- True if the queued restart has occurred and restart_hook called.
4221- True if no restart request has been queued.
4222- False if we are still waiting on a queued restart.
4223-
4224- restart_hooks is a list of callables that takes no parameters.
4225-
4226- For best results, call rolling_restart() unconditionally at the
4227- beginning and end of every hook.
4228-
4229- At a minimum, rolling_restart() must be called every peer
4230- relation-changed hook, and after any non-peer hook calls
4231- request_restart(), or the system will deadlock.
4232- '''
4233- _peer_echo() # Always call this, or peerstorage will fail.
4234-
4235- if not is_waiting_for_restart():
4236- cancel_restart() # Clear any errant queue entries.
4237- return True
4238-
4239- for prerequisite in prerequisites:
4240- if not prerequisite():
4241- hookenv.log('Prerequisite {} not met. Restart deferred.')
4242- _enqueue(False)
4243- _enqueue(True) # Rejoin at the end of the queue.
4244- return False
4245-
4246- def _restart():
4247- for restart_hook in restart_hooks:
4248- restart_hook()
4249- hookenv.log('Restart complete.')
4250- cancel_restart()
4251- hookenv.log('Restart queue entries purged.')
4252- return True
4253-
4254- try:
4255- # If there are no peers, restart the service now since there is
4256- # nobody to coordinate with.
4257- if len(get_peers()) == 0:
4258- hookenv.log('Restart request with no peers. Restarting.')
4259- return _restart()
4260-
4261- local_unit = hookenv.local_unit()
4262- queue = get_restart_queue()
4263-
4264- # If we are not in the restart queue, join it and restart later.
4265- # If there are peers, we cannot restart in the same hook we made
4266- # the request or we will race with other units trying to restart.
4267- if local_unit not in queue:
4268- hookenv.log('Joining rolling restart queue')
4269- _enqueue(True)
4270- return False
4271-
4272- # We joined the restart queue in a previous hook, but we still have
4273- # to wait until a peer relation-changed hook before we can do a
4274- # restart. If we did the restart in any old hook, we might not see
4275- # higher priority requests from peers.
4276- peer_relname = get_peer_relation_name()
4277- changed_hook = '{}-relation-changed'.format(peer_relname)
4278- if local_unit == queue[0] and hookenv.hook_name() == changed_hook:
4279- hookenv.log('First in rolling restart queue. Restarting.')
4280- return _restart()
4281-
4282- queue_str = ', '.join(queue)
4283- hookenv.log('Waiting in rolling restart queue ({})'.format(queue_str))
4284- return False
4285- except DeferRestart:
4286- _enqueue(False)
4287- _enqueue(True) # Rejoin at the end of the queue.
4288- return False
4289-
4290-
4291-def _peerstorage_key():
4292- return 'rollingrestart_{}'.format(hookenv.local_unit())
4293-
4294-
4295-def _peer_echo():
4296- peer_relname = get_peer_relation_name()
4297- changed_hook = '{}-relation-changed'.format(peer_relname)
4298-
4299- # If we are in the peer relation-changed hook, we must invoke
4300- # peerstorage.peer_echo() or peerstorage will fail.
4301- if hookenv.hook_name() == changed_hook:
4302- # We only want to echo the restart entry for the remote unit, as
4303- # any other information it has for other peers may be stale.
4304- includes = ['rollingrestart_{}'.format(hookenv.remote_unit())]
4305- hookenv.log('peerstorage.peer_echo for rolling restart.')
4306- hookenv.log('peer_echo(includes={!r})'.format(includes), DEBUG)
4307- # This method only works from the peer relation-changed hook.
4308- peerstorage.peer_echo(includes)
4309-
4310- # If a peer is departing, clean out any rubbish it left in the
4311- # peerstorage.
4312- departed_hook = '{}-relation-departed'.format(peer_relname)
4313- if hookenv.hook_name() == departed_hook:
4314- key = 'rollingrestart_{}'.format(hookenv.remote_unit())
4315- peerstorage.peer_store(key, None, peer_relname)
4316-
4317-
4318-def get_peer_relation_name():
4319- # Make this the charmhelpers.contrib.peerstorage default?
4320- # Although it is normal to have a single peer relation, it is
4321- # possible to have many. We use the first in alphabetical order.
4322- md = hookenv.metadata()
4323- assert 'peers' in md, 'No peer relations in metadata.yaml'
4324- return sorted(md['peers'].keys())[0]
4325-
4326-
4327-def get_peer_relation_id():
4328- for relid in hookenv.relation_ids(get_peer_relation_name()):
4329- return relid
4330- return None
4331-
4332-
4333-def get_peers():
4334- """Return the sorted list of peer units.
4335-
4336- Peers only. Does not include the local unit.
4337- """
4338- for relid in hookenv.relation_ids(get_peer_relation_name()):
4339- return sorted(hookenv.related_units(relid),
4340- key=lambda x: int(x.split('/')[-1]))
4341- return []
4342-
4343-
4344-def utcnow():
4345- return datetime.utcnow()
4346-
4347-
4348-def utcnow_str():
4349- return utcnow().strftime('%Y-%m-%d %H:%M:%S.%fZ')
4350-
4351-
4352-def make_service(restart_hooks, prerequisites=()):
4353- """Create a service dictionary for rolling restart.
4354-
4355- This needs to be a separate service, rather than a data_ready or
4356- provided_data item on an existing service, because the rolling_restart()
4357- function must be called for all hooks to avoid deadlocking the system.
4358- It would only be safe to list rolling_restart() as a data_ready item in
4359- a service with no requirements.
4360- """
4361- def data_ready(servicename):
4362- rolling_restart(restart_hooks, prerequisites)
4363-
4364- return dict(service='rollingrestart', data_ready=data_ready,
4365- stop=[], start=[])
4366
4367=== modified symlink 'hooks/stop' (properties changed: -x to +x)
4368=== target was u'hooks.py'
4369--- hooks/stop 1970-01-01 00:00:00 +0000
4370+++ hooks/stop 2015-06-30 07:36:26 +0000
4371@@ -0,0 +1,20 @@
4372+#!/usr/bin/python3
4373+# Copyright 2015 Canonical Ltd.
4374+#
4375+# This file is part of the Cassandra Charm for Juju.
4376+#
4377+# This program is free software: you can redistribute it and/or modify
4378+# it under the terms of the GNU General Public License version 3, as
4379+# published by the Free Software Foundation.
4380+#
4381+# This program is distributed in the hope that it will be useful, but
4382+# WITHOUT ANY WARRANTY; without even the implied warranties of
4383+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
4384+# PURPOSE. See the GNU General Public License for more details.
4385+#
4386+# You should have received a copy of the GNU General Public License
4387+# along with this program. If not, see <http://www.gnu.org/licenses/>.
4388+import hooks
4389+if __name__ == '__main__':
4390+ hooks.bootstrap()
4391+ hooks.default_hook()
4392
4393=== modified symlink 'hooks/upgrade-charm' (properties changed: -x to +x)
4394=== target was u'hooks.py'
4395--- hooks/upgrade-charm 1970-01-01 00:00:00 +0000
4396+++ hooks/upgrade-charm 2015-06-30 07:36:26 +0000
4397@@ -0,0 +1,20 @@
4398+#!/usr/bin/python3
4399+# Copyright 2015 Canonical Ltd.
4400+#
4401+# This file is part of the Cassandra Charm for Juju.
4402+#
4403+# This program is free software: you can redistribute it and/or modify
4404+# it under the terms of the GNU General Public License version 3, as
4405+# published by the Free Software Foundation.
4406+#
4407+# This program is distributed in the hope that it will be useful, but
4408+# WITHOUT ANY WARRANTY; without even the implied warranties of
4409+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
4410+# PURPOSE. See the GNU General Public License for more details.
4411+#
4412+# You should have received a copy of the GNU General Public License
4413+# along with this program. If not, see <http://www.gnu.org/licenses/>.
4414+import hooks
4415+if __name__ == '__main__':
4416+ hooks.bootstrap()
4417+ hooks.default_hook()
4418
4419=== modified file 'templates/cassandra_maintenance_cron.tmpl'
4420--- templates/cassandra_maintenance_cron.tmpl 2015-01-23 19:04:15 +0000
4421+++ templates/cassandra_maintenance_cron.tmpl 2015-06-30 07:36:26 +0000
4422@@ -1,4 +1,4 @@
4423 # Cassandra maintenance
4424 # Staggered weekly repairs
4425 # m h dom mon dow user command
4426-{{minute}} {{hour}} * * {{dow}} cassandra run-one-until-success nodetool repair -pr && run-one-until-success nodetool cleanup
4427+{{minute}} {{hour}} * * {{dow}} cassandra run-one-until-success nodetool repair -pr
4428
4429=== modified file 'tests/test_actions.py'
4430--- tests/test_actions.py 2015-04-10 14:42:53 +0000
4431+++ tests/test_actions.py 2015-06-30 07:36:26 +0000
4432@@ -17,7 +17,6 @@
4433 # along with this program. If not, see <http://www.gnu.org/licenses/>.
4434
4435 import errno
4436-import functools
4437 from itertools import repeat
4438 import os.path
4439 import re
4440@@ -29,17 +28,16 @@
4441 from unittest.mock import ANY, call, patch, sentinel
4442 import yaml
4443
4444+import cassandra
4445 from charmhelpers.core import hookenv
4446
4447 from tests.base import TestCaseBase
4448 import actions
4449+from coordinator import coordinator
4450 import helpers
4451
4452
4453-patch = functools.partial(patch, autospec=True) # autospec=True as default.
4454-
4455-
4456-class TestsActions(TestCaseBase):
4457+class TestActions(TestCaseBase):
4458 def test_action_wrapper(self):
4459 @actions.action
4460 def somefunc(*args, **kw):
4461@@ -61,40 +59,38 @@
4462 '** Action catch-fire/somefunc (foo)')
4463
4464 def test_revert_unchangeable_config(self):
4465- hookenv.hook_name.return_value = 'config-changed'
4466 config = hookenv.config()
4467
4468 self.assertIn('datacenter', actions.UNCHANGEABLE_KEYS)
4469
4470+ # In the first hook, revert does nothing as there is nothing to
4471+ # revert to.
4472 config['datacenter'] = 'mission_control'
4473+ self.assertTrue(config.changed('datacenter'))
4474+ actions.revert_unchangeable_config('')
4475+ self.assertEqual(config['datacenter'], 'mission_control')
4476+
4477 config.save()
4478 config.load_previous()
4479 config['datacenter'] = 'orbital_1'
4480
4481- self.assertTrue(config.changed('datacenter'))
4482-
4483 actions.revert_unchangeable_config('')
4484 self.assertEqual(config['datacenter'], 'mission_control') # Reverted
4485
4486- hookenv.log.assert_any_call(ANY, hookenv.ERROR)
4487-
4488- def test_revert_unchangeable_config_install(self):
4489- hookenv.hook_name.return_value = 'install'
4490- config = hookenv.config()
4491-
4492- self.assertIn('datacenter', actions.UNCHANGEABLE_KEYS)
4493-
4494- config['datacenter'] = 'mission_control'
4495- config.save()
4496- config.load_previous()
4497- config['datacenter'] = 'orbital_1'
4498-
4499- self.assertTrue(config.changed('datacenter'))
4500-
4501- # In the install hook, revert_unchangeable_config() does
4502- # nothing.
4503- actions.revert_unchangeable_config('')
4504- self.assertEqual(config['datacenter'], 'orbital_1')
4505+ hookenv.log.assert_any_call(ANY, hookenv.ERROR) # Logged the problem.
4506+
4507+ @patch('charmhelpers.core.hookenv.is_leader')
4508+ def test_leader_only(self, is_leader):
4509+
4510+ @actions.leader_only
4511+ def f(*args, **kw):
4512+ return args, kw
4513+
4514+ is_leader.return_value = False
4515+ self.assertIsNone(f(1, foo='bar'))
4516+
4517+ is_leader.return_value = True
4518+ self.assertEqual(f(1, foo='bar'), ((1,), dict(foo='bar')))
4519
4520 def test_set_proxy(self):
4521 # NB. Environment is already mocked.
4522@@ -315,6 +311,123 @@
4523 # handles this.
4524 self.assertFalse(check_call.called)
4525
4526+ @patch('actions._install_oracle_jre_tarball')
4527+ @patch('actions._fetch_oracle_jre')
4528+ def test_install_oracle_jre(self, fetch, install_tarball):
4529+ fetch.return_value = sentinel.tarball
4530+
4531+ actions.install_oracle_jre('')
4532+ self.assertFalse(fetch.called)
4533+ self.assertFalse(install_tarball.called)
4534+
4535+ hookenv.config()['jre'] = 'oracle'
4536+ actions.install_oracle_jre('')
4537+ fetch.assert_called_once_with()
4538+ install_tarball.assert_called_once_with(sentinel.tarball)
4539+
4540+ @patch('helpers.status_set')
4541+ @patch('urllib.request')
4542+ def test_fetch_oracle_jre(self, req, status_set):
4543+ config = hookenv.config()
4544+ url = 'https://foo.example.com/server-jre-7u42-linux-x64.tar.gz'
4545+ expected_tarball = os.path.join(hookenv.charm_dir(), 'lib',
4546+ 'server-jre-7u42-linux-x64.tar.gz')
4547+ config['private_jre_url'] = url
4548+
4549+ # Create a dummy tarball, since the mock urlretrieve won't.
4550+ os.makedirs(os.path.dirname(expected_tarball))
4551+ with open(expected_tarball, 'w'):
4552+ pass # Empty file
4553+
4554+ self.assertEqual(actions._fetch_oracle_jre(), expected_tarball)
4555+ req.urlretrieve.assert_called_once_with(url, expected_tarball)
4556+
4557+ def test_fetch_oracle_jre_local(self):
4558+ # Create an existing tarball. If it is found, it will be used
4559+ # without needing to specify a remote url or actually download
4560+ # anything.
4561+ expected_tarball = os.path.join(hookenv.charm_dir(), 'lib',
4562+ 'server-jre-7u42-linux-x64.tar.gz')
4563+ os.makedirs(os.path.dirname(expected_tarball))
4564+ with open(expected_tarball, 'w'):
4565+ pass # Empty file
4566+
4567+ self.assertEqual(actions._fetch_oracle_jre(), expected_tarball)
4568+
4569+ @patch('helpers.status_set')
4570+ def test_fetch_oracle_jre_notfound(self, status_set):
4571+ with self.assertRaises(SystemExit) as x:
4572+ actions._fetch_oracle_jre()
4573+ self.assertEqual(x.code, 0)
4574+ status_set.assert_called_once_with('blocked', ANY)
4575+
4576+ @patch('subprocess.check_call')
4577+ @patch('charmhelpers.core.host.mkdir')
4578+ @patch('os.path.isdir')
4579+ def test_install_oracle_jre_tarball(self, isdir, mkdir, check_call):
4580+ isdir.return_value = False
4581+
4582+ dest = '/usr/lib/jvm/java-8-oracle'
4583+
4584+ actions._install_oracle_jre_tarball(sentinel.tarball)
4585+ mkdir.assert_called_once_with(dest)
4586+ check_call.assert_has_calls([
4587+ call(['tar', '-xz', '-C', dest,
4588+ '--strip-components=1', '-f', sentinel.tarball]),
4589+ call(['update-alternatives', '--install',
4590+ '/usr/bin/java', 'java',
4591+ os.path.join(dest, 'bin', 'java'), '1']),
4592+ call(['update-alternatives', '--set', 'java',
4593+ os.path.join(dest, 'bin', 'java')]),
4594+ call(['update-alternatives', '--install',
4595+ '/usr/bin/javac', 'javac',
4596+ os.path.join(dest, 'bin', 'javac'), '1']),
4597+ call(['update-alternatives', '--set', 'javac',
4598+ os.path.join(dest, 'bin', 'javac')])])
4599+
4600+ @patch('os.path.exists')
4601+ @patch('subprocess.check_call')
4602+ @patch('charmhelpers.core.host.mkdir')
4603+ @patch('os.path.isdir')
4604+ def test_install_oracle_jre_tarball_already(self, isdir,
4605+ mkdir, check_call, exists):
4606+ isdir.return_value = True
4607+ exists.return_value = True # jre already installed
4608+
4609+ # Store the version previously installed.
4610+ hookenv.config()['oracle_jre_tarball'] = sentinel.tarball
4611+
4612+ dest = '/usr/lib/jvm/java-8-oracle'
4613+
4614+ actions._install_oracle_jre_tarball(sentinel.tarball)
4615+
4616+ self.assertFalse(mkdir.called) # The jvm dir already existed.
4617+
4618+ exists.assert_called_once_with('/usr/lib/jvm/java-8-oracle/bin/java')
4619+
4620+ # update-alternatives done, but tarball not extracted.
4621+ check_call.assert_has_calls([
4622+ call(['update-alternatives', '--install',
4623+ '/usr/bin/java', 'java',
4624+ os.path.join(dest, 'bin', 'java'), '1']),
4625+ call(['update-alternatives', '--set', 'java',
4626+ os.path.join(dest, 'bin', 'java')]),
4627+ call(['update-alternatives', '--install',
4628+ '/usr/bin/javac', 'javac',
4629+ os.path.join(dest, 'bin', 'javac'), '1']),
4630+ call(['update-alternatives', '--set', 'javac',
4631+ os.path.join(dest, 'bin', 'javac')])])
4632+
4633+ @patch('subprocess.check_output')
4634+ def test_emit_java_version(self, check_output):
4635+ check_output.return_value = 'Line 1\nLine 2'
4636+ actions.emit_java_version('')
4637+ check_output.assert_called_once_with(['java', '-version'],
4638+ universal_newlines=True)
4639+ hookenv.log.assert_has_calls([call(ANY),
4640+ call('JRE: Line 1'),
4641+ call('JRE: Line 2')])
4642+
4643 @patch('helpers.configure_cassandra_yaml')
4644 def test_configure_cassandra_yaml(self, configure_cassandra_yaml):
4645 # actions.configure_cassandra_yaml is just a wrapper around the
4646@@ -399,10 +512,62 @@
4647 self.assertEqual(f.read().strip(),
4648 'dc=test_dc\nrack=test_rack')
4649
4650- @patch('helpers.reset_auth_keyspace_replication')
4651- def test_reset_auth_keyspace_replication(self, helpers_reset):
4652+ @patch('helpers.connect')
4653+ @patch('helpers.get_auth_keyspace_replication')
4654+ @patch('helpers.num_nodes')
4655+ def test_needs_reset_auth_keyspace_replication(self, num_nodes,
4656+ get_auth_ks_rep,
4657+ connect):
4658+ num_nodes.return_value = 4
4659+ connect().__enter__.return_value = sentinel.session
4660+ connect().__exit__.return_value = False
4661+ get_auth_ks_rep.return_value = {'another': '8'}
4662+ self.assertTrue(actions.needs_reset_auth_keyspace_replication())
4663+
4664+ @patch('helpers.connect')
4665+ @patch('helpers.get_auth_keyspace_replication')
4666+ @patch('helpers.num_nodes')
4667+ def test_needs_reset_auth_keyspace_replication_false(self, num_nodes,
4668+ get_auth_ks_rep,
4669+ connect):
4670+ config = hookenv.config()
4671+ config['datacenter'] = 'mydc'
4672+ connect().__enter__.return_value = sentinel.session
4673+ connect().__exit__.return_value = False
4674+
4675+ num_nodes.return_value = 4
4676+ get_auth_ks_rep.return_value = {'another': '8',
4677+ 'mydc': '3'}
4678+ self.assertFalse(actions.needs_reset_auth_keyspace_replication())
4679+
4680+ @patch('helpers.set_active')
4681+ @patch('helpers.repair_auth_keyspace')
4682+ @patch('helpers.connect')
4683+ @patch('helpers.set_auth_keyspace_replication')
4684+ @patch('helpers.get_auth_keyspace_replication')
4685+ @patch('helpers.num_nodes')
4686+ @patch('charmhelpers.core.hookenv.is_leader')
4687+ def test_reset_auth_keyspace_replication(self, is_leader, num_nodes,
4688+ get_auth_ks_rep,
4689+ set_auth_ks_rep,
4690+ connect, repair, set_active):
4691+ is_leader.return_value = True
4692+ num_nodes.return_value = 4
4693+ coordinator.grants = {}
4694+ coordinator.requests = {hookenv.local_unit(): {}}
4695+ coordinator.grant('repair', hookenv.local_unit())
4696+ config = hookenv.config()
4697+ config['datacenter'] = 'mydc'
4698+ connect().__enter__.return_value = sentinel.session
4699+ connect().__exit__.return_value = False
4700+ get_auth_ks_rep.return_value = {'another': '8'}
4701+ self.assertTrue(actions.needs_reset_auth_keyspace_replication())
4702 actions.reset_auth_keyspace_replication('')
4703- helpers_reset.assert_called_once_with()
4704+ set_auth_ks_rep.assert_called_once_with(
4705+ sentinel.session,
4706+ {'class': 'NetworkTopologyStrategy', 'another': '8', 'mydc': 3})
4707+ repair.assert_called_once_with()
4708+ set_active.assert_called_once_with()
4709
4710 def test_store_unit_private_ip(self):
4711 hookenv.unit_private_ip.side_effect = None
4712@@ -410,181 +575,92 @@
4713 actions.store_unit_private_ip('')
4714 self.assertEqual(hookenv.config()['unit_private_ip'], sentinel.ip)
4715
4716- @patch('helpers.set_bootstrapped')
4717- @patch('helpers.unit_number')
4718- def test_set_unit_zero_bootstrapped(self, unit_number, set_bootstrapped):
4719- unit_number.return_value = 0
4720- hookenv.hook_name.return_value = 'cluster-whatever'
4721- actions.set_unit_zero_bootstrapped('')
4722- set_bootstrapped.assert_called_once_with(True)
4723- set_bootstrapped.reset_mock()
4724-
4725- # Noop if unit number is not 0.
4726- unit_number.return_value = 2
4727- actions.set_unit_zero_bootstrapped('')
4728- self.assertFalse(set_bootstrapped.called)
4729-
4730- # Noop if called from a non peer relation hook. Unfortunatly,
4731- # attempting to set attributes on the peer relation fails before
4732- # the peer relation has been joined.
4733- unit_number.return_value = 0
4734- hookenv.hook_name.return_value = 'install'
4735- actions.set_unit_zero_bootstrapped('')
4736- self.assertFalse(set_bootstrapped.called)
4737-
4738+ @patch('charmhelpers.core.host.service_start')
4739+ @patch('helpers.status_set')
4740+ @patch('helpers.actual_seed_ips')
4741+ @patch('helpers.get_seed_ips')
4742+ @patch('relations.StorageRelation.needs_remount')
4743+ @patch('helpers.is_bootstrapped')
4744 @patch('helpers.is_cassandra_running')
4745 @patch('helpers.is_decommissioned')
4746- @patch('rollingrestart.request_restart')
4747- def test_maybe_schedule_restart_decommissioned(self, request_restart,
4748- is_running, is_decom):
4749+ def test_needs_restart(self, is_decom, is_running, is_bootstrapped,
4750+ needs_remount, seed_ips, actual_seeds,
4751+ status_set, service_start):
4752+ is_decom.return_value = False
4753+ is_running.return_value = True
4754+ needs_remount.return_value = False
4755+ seed_ips.return_value = set(['1.2.3.4'])
4756+ actual_seeds.return_value = set(['1.2.3.4'])
4757+
4758+ config = hookenv.config()
4759+ config['configured_seeds'] = list(sorted(seed_ips()))
4760+ config.save()
4761+ config.load_previous() # Ensure everything flagged as unchanged.
4762+
4763+ self.assertFalse(actions.needs_restart())
4764+
4765+ # Decommissioned nodes are not restarted.
4766 is_decom.return_value = True
4767- is_running.return_value = True
4768- actions.maybe_schedule_restart('')
4769- self.assertFalse(request_restart.called)
4770-
4771- @patch('helpers.is_decommissioned')
4772- @patch('helpers.seed_ips')
4773- @patch('helpers.is_cassandra_running')
4774- @patch('relations.StorageRelation')
4775- @patch('rollingrestart.request_restart')
4776- def test_maybe_schedule_restart_need_remount(self, request_restart,
4777- storage_relation, is_running,
4778- seed_ips, is_decom):
4779- config = hookenv.config()
4780-
4781- is_decom.return_value = False
4782- is_running.return_value = True
4783-
4784- # Storage says we need to restart.
4785- storage_relation().needs_remount.return_value = True
4786-
4787- # No new seeds
4788- seed_ips.return_value = set(['a'])
4789- config['configured_seeds'] = sorted(seed_ips())
4790- config.save()
4791- config.load_previous()
4792-
4793- # IP address is unchanged.
4794- config['unit_private_ip'] = hookenv.unit_private_ip()
4795-
4796- # Config items are unchanged.
4797- config.save()
4798- config.load_previous()
4799-
4800- actions.maybe_schedule_restart('')
4801- request_restart.assert_called_once_with()
4802- hookenv.log.assert_any_call('Mountpoint changed. '
4803- 'Restart and migration required.')
4804-
4805- @patch('helpers.is_decommissioned')
4806- @patch('helpers.seed_ips')
4807- @patch('helpers.is_cassandra_running')
4808- @patch('relations.StorageRelation')
4809- @patch('rollingrestart.request_restart')
4810- def test_maybe_schedule_restart_unchanged(self, request_restart,
4811- storage_relation, is_running,
4812- seed_ips, is_decom):
4813- config = hookenv.config()
4814- config['configured_seeds'] = ['a']
4815- config.save()
4816- config.load_previous()
4817-
4818- is_decom.return_value = False
4819- is_running.return_value = True
4820-
4821- # Storage says we do not need to restart.
4822- storage_relation().needs_remount.return_value = False
4823-
4824- # No new seeds
4825- seed_ips.return_value = set(['a'])
4826- config['configured_seeds'] = sorted(seed_ips())
4827- config.save()
4828- config.load_previous()
4829-
4830- # IP address is unchanged.
4831- config['unit_private_ip'] = hookenv.unit_private_ip()
4832-
4833- # Config items are unchanged, except for ones that do not
4834- # matter.
4835- config.save()
4836- config.load_previous()
4837- config['package_status'] = 'new'
4838- self.assertTrue(config.changed('package_status'))
4839-
4840- actions.maybe_schedule_restart('')
4841- self.assertFalse(request_restart.called)
4842-
4843- @patch('helpers.is_decommissioned')
4844- @patch('helpers.is_cassandra_running')
4845- @patch('helpers.seed_ips')
4846- @patch('relations.StorageRelation')
4847- @patch('rollingrestart.request_restart')
4848- def test_maybe_schedule_restart_config_changed(self, request_restart,
4849- storage_relation,
4850- seed_ips, is_running,
4851- is_decom):
4852- config = hookenv.config()
4853-
4854- is_decom.return_value = False
4855- is_running.return_value = True
4856-
4857- # Storage says we do not need to restart.
4858- storage_relation().needs_remount.return_value = False
4859-
4860- # IP address is unchanged.
4861- config['unit_private_ip'] = hookenv.unit_private_ip()
4862-
4863- # Config items are changed.
4864- config.save()
4865- config.load_previous()
4866- config['package_status'] = 'new'
4867- self.assertTrue(config.changed('package_status')) # Doesn't matter.
4868- config['max_heap_size'] = 'lots and lots'
4869- self.assertTrue(config.changed('max_heap_size')) # Requires restart.
4870-
4871- actions.maybe_schedule_restart('')
4872- request_restart.assert_called_once_with()
4873- hookenv.log.assert_any_call('max_heap_size changed. Restart required.')
4874-
4875- @patch('helpers.is_decommissioned')
4876- @patch('helpers.seed_ips')
4877- @patch('helpers.is_cassandra_running')
4878- @patch('relations.StorageRelation')
4879- @patch('rollingrestart.request_restart')
4880- def test_maybe_schedule_restart_ip_changed(self, request_restart,
4881- storage_relation, is_running,
4882- seed_ips, is_decom):
4883- is_decom.return_value = False
4884- is_running.return_value = True
4885-
4886- # Storage says we do not need to restart.
4887- storage_relation().needs_remount.return_value = False
4888-
4889- # No new seeds
4890- seed_ips.return_value = set(['a'])
4891- config = hookenv.config()
4892- config['configured_seeds'] = sorted(seed_ips())
4893- config.save()
4894- config.load_previous()
4895-
4896- # Config items are unchanged.
4897- config = hookenv.config()
4898- config.save()
4899- config.load_previous()
4900-
4901- actions.store_unit_private_ip('') # IP address change
4902-
4903- actions.maybe_schedule_restart('')
4904- request_restart.assert_called_once_with()
4905- hookenv.log.assert_any_call('Unit IP address changed. '
4906- 'Restart required.')
4907-
4908- @patch('helpers.is_cassandra_running')
4909- @patch('rollingrestart.request_restart')
4910- def test_maybe_schedule_restart_down(self, request_restart, is_running):
4911+ self.assertFalse(actions.needs_restart())
4912+ is_decom.return_value = False
4913+ self.assertFalse(actions.needs_restart())
4914+
4915+ # Nodes not running need to be restarted.
4916 is_running.return_value = False
4917- actions.maybe_schedule_restart('')
4918- request_restart.assert_called_once_with()
4919+ self.assertTrue(actions.needs_restart())
4920+ is_running.return_value = True
4921+ self.assertFalse(actions.needs_restart())
4922+
4923+ # If we have a new mountpoint, we need to restart in order to
4924+ # migrate data.
4925+ needs_remount.return_value = True
4926+ self.assertTrue(actions.needs_restart())
4927+ needs_remount.return_value = False
4928+ self.assertFalse(actions.needs_restart())
4929+
4930+ # Certain changed config items trigger a restart.
4931+ config['max_heap_size'] = '512M'
4932+ self.assertTrue(actions.needs_restart())
4933+ config.save()
4934+ config.load_previous()
4935+ self.assertFalse(actions.needs_restart())
4936+
4937+ # A new IP address requires a restart.
4938+ config['unit_private_ip'] = 'new'
4939+ self.assertTrue(actions.needs_restart())
4940+ config.save()
4941+ config.load_previous()
4942+ self.assertFalse(actions.needs_restart())
4943+
4944+ # If the seeds have changed, we need to restart.
4945+ seed_ips.return_value = set(['9.8.7.6'])
4946+ actual_seeds.return_value = set(['9.8.7.6'])
4947+ self.assertTrue(actions.needs_restart())
4948+ is_running.side_effect = iter([False, True])
4949+ helpers.start_cassandra()
4950+ is_running.side_effect = None
4951+ is_running.return_value = True
4952+ self.assertFalse(actions.needs_restart())
4953+
4954+ @patch('charmhelpers.core.hookenv.is_leader')
4955+ @patch('helpers.is_bootstrapped')
4956+ @patch('helpers.ensure_database_directories')
4957+ @patch('helpers.remount_cassandra')
4958+ @patch('helpers.start_cassandra')
4959+ @patch('helpers.stop_cassandra')
4960+ @patch('helpers.status_set')
4961+ def test_maybe_restart(self, status_set, stop_cassandra, start_cassandra,
4962+ remount, ensure_directories, is_bootstrapped,
4963+ is_leader):
4964+ coordinator.grants = {}
4965+ coordinator.requests = {hookenv.local_unit(): {}}
4966+ coordinator.relid = 'cluster:1'
4967+ coordinator.grant('restart', hookenv.local_unit())
4968+ actions.maybe_restart('')
4969+ stop_cassandra.assert_called_once_with()
4970+ remount.assert_called_once_with()
4971+ ensure_directories.assert_called_once_with()
4972+ start_cassandra.assert_called_once_with()
4973
4974 @patch('helpers.stop_cassandra')
4975 def test_stop_cassandra(self, helpers_stop_cassandra):
4976@@ -596,15 +672,29 @@
4977 actions.start_cassandra('ignored')
4978 helpers_start_cassandra.assert_called_once_with()
4979
4980- @patch('helpers.ensure_unit_superuser')
4981- def test_ensure_unit_superuser(self, helpers_ensure_unit_superuser):
4982- actions.ensure_unit_superuser('ignored')
4983- helpers_ensure_unit_superuser.assert_called_once_with()
4984+ @patch('os.path.isdir')
4985+ @patch('helpers.get_all_database_directories')
4986+ @patch('helpers.set_io_scheduler')
4987+ def test_reset_all_io_schedulers(self, set_io_scheduler, dbdirs, isdir):
4988+ hookenv.config()['io_scheduler'] = sentinel.io_scheduler
4989+ dbdirs.return_value = dict(
4990+ data_file_directories=[sentinel.d1, sentinel.d2],
4991+ commitlog_directory=sentinel.cl,
4992+ saved_caches_directory=sentinel.sc)
4993+ isdir.return_value = True
4994+ actions.reset_all_io_schedulers('')
4995+ set_io_scheduler.assert_has_calls([
4996+ call(sentinel.io_scheduler, sentinel.d1),
4997+ call(sentinel.io_scheduler, sentinel.d2),
4998+ call(sentinel.io_scheduler, sentinel.cl),
4999+ call(sentinel.io_scheduler, sentinel.sc)],
5000+ any_order=True)
The diff has been truncated for viewing.
