Merge lp:~stub/charms/trusty/cassandra/wait-for-joining into lp:charms/trusty/cassandra

Proposed by Stuart Bishop
Status: Merged
Merge reported by: Stuart Bishop
Merged at revision: not available
Proposed branch: lp:~stub/charms/trusty/cassandra/wait-for-joining
Merge into: lp:charms/trusty/cassandra
Prerequisite: lp:~stub/charms/trusty/cassandra/ensure-thrift
Diff against target: 116 lines (+36/-17)
2 files modified
hooks/actions.py (+34/-15)
tests/test_actions.py (+2/-2)
To merge this branch: bzr merge lp:~stub/charms/trusty/cassandra/wait-for-joining
Reviewer: Cory Johns (community)
Review status: Needs Information
Review via email: mp+286993@code.launchpad.net

Commit message

Make bootstrap procedure safer

Description of the change

We were previously capping the replication factor of the system_auth keyspace at three. I have since found documentation recommending that the replication factor be increased so that every node has a copy of this data.

Also, a somewhat undocumented caveat of bootstrapping is that, at least when using vnodes, only one node should be JOINING at a time. We now keep hold of the lock until the bootstrapping node has become NORMAL, which also neatly stops other nodes from restarting while it is attempting to stream data.
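
As an illustration only (not code from this branch), making the system_auth replication factor match the node count comes down to an ALTER KEYSPACE followed by a repair. Below is a minimal sketch using the DataStax Python driver; the contact point, datacenter name and node count are assumed inputs, and the charm itself does this through its own helpers (set_auth_keyspace_replication and repair_auth_keyspace).

# Illustration only, not the charm's helpers: raise system_auth replication
# so every node in the datacenter holds a copy of the authentication data.
# contact_point, datacenter and num_nodes are assumed inputs for the example.
from cassandra.cluster import Cluster


def set_system_auth_rf(contact_point, datacenter, num_nodes):
    cluster = Cluster([contact_point])
    session = cluster.connect()
    # NetworkTopologyStrategy keyed by datacenter, with rf equal to the
    # node count so losing any single node never loses the auth data.
    cql = ("ALTER KEYSPACE system_auth WITH REPLICATION = "
           "{{'class': 'NetworkTopologyStrategy', '{dc}': {rf}}}").format(
               dc=datacenter, rf=num_nodes)
    session.execute(cql)
    cluster.shutdown()
    # After raising the replication factor, 'nodetool repair system_auth'
    # still needs to be run on each node to populate the new replicas.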

Revision history for this message
Cory Johns (johnsca) wrote :

Overall this looks good, but I'm a bit concerned about the potential for a deadlock in the charm due to the JOINING / NORMAL wait. Would it be possible to add a timeout to that loop so that, if it is exceeded, the charm is put into a "waiting" state and retries on the next update-status hook?

review: Needs Information
Revision history for this message
Stuart Bishop (stub) wrote :

@johnsca Yes, I think that would be best. It may not be a minor change though, as I need to ensure that other units don't attempt to join until the waiting unit has completed, and the charmhelpers.coordinator lock is only held for the duration of the hook. Or perhaps Cassandra is robust enough in the supported versions that I can reliably add several units to the cluster at once and they will happily block until it is their turn to bootstrap (which would greatly simplify things). The other issue is making sure failed units stay in their blocked state, but that seems fairly minor. This will of course fall out much more naturally when I get a chance to rewrite the charm for charms.reactive.

It would still be good to land this as is IMO, as a deadlocked unit is preferable to the current behaviour of not noticing the issue and ploughing ahead with operations potentially dangerous to the whole cluster.
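
As a reference for the discussion above, a bounded version of the JOINING/NORMAL wait might look something like the sketch below. It is not part of this branch: get_node_status and status_set stand in for the charm's helpers.get_node_status() and helpers.status_set(), and the 30 minute timeout and the 'waiting' message are assumptions for illustration.

# Sketch only, not part of this branch: a bounded version of the
# JOINING/NORMAL wait. get_node_status and status_set stand in for the
# charm's helpers; the timeout value is an arbitrary assumption.
import time

JOIN_TIMEOUT = 30 * 60  # seconds (assumed value for illustration)


def wait_for_normal(get_node_status, status_set):
    deadline = time.time() + JOIN_TIMEOUT
    while time.time() < deadline:
        status = get_node_status()
        if status == 'NORMAL':
            return
        if status == 'JOINING':
            status_set('maintenance', 'Still joining cluster')
            time.sleep(10)
        else:
            # None (node shut down) or any other state means bootstrap failed.
            status_set('blocked', 'Failed to bootstrap ({})'.format(status))
            raise SystemExit(0)
    # Timed out: record a 'waiting' state and exit cleanly so the next
    # update-status hook re-runs this check instead of deadlocking the hook.
    status_set('waiting', 'Still joining cluster; will check again')
    raise SystemExit(0)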

Preview Diff

=== modified file 'hooks/actions.py'
--- hooks/actions.py 2016-02-24 10:02:13 +0000
+++ hooks/actions.py 2016-02-24 10:02:13 +0000
@@ -432,18 +432,13 @@
 def needs_reset_auth_keyspace_replication():
     '''Guard for reset_auth_keyspace_replication.'''
     num_nodes = helpers.num_nodes()
-    n = min(num_nodes, 3)
     datacenter = hookenv.config()['datacenter']
     with helpers.connect() as session:
         strategy_opts = helpers.get_auth_keyspace_replication(session)
         rf = int(strategy_opts.get(datacenter, -1))
         hookenv.log('system_auth rf={!r}'.format(strategy_opts))
-        # If the node count has increased, we will change the rf.
-        # If the node count is decreasing, we do nothing as the service
-        # may be being destroyed.
-        if rf < n:
-            return True
-        return False
+        # If the node count has changed, we should change the rf.
+        return rf != num_nodes
 
 
 @leader_only
@@ -453,23 +448,23 @@
 def reset_auth_keyspace_replication():
     # Cassandra requires you to manually set the replication factor of
     # the system_auth keyspace, to ensure availability and redundancy.
-    # We replication factor in this service's DC can be no higher than
-    # the number of bootstrapped nodes. We also cap this at 3 to ensure
-    # we don't have too many seeds.
+    # The recommendation is to set the replication factor so that every
+    # node has a copy.
     num_nodes = helpers.num_nodes()
-    n = min(num_nodes, 3)
     datacenter = hookenv.config()['datacenter']
     with helpers.connect() as session:
         strategy_opts = helpers.get_auth_keyspace_replication(session)
         rf = int(strategy_opts.get(datacenter, -1))
         hookenv.log('system_auth rf={!r}'.format(strategy_opts))
-        if rf != n:
+        if rf != num_nodes:
             strategy_opts['class'] = 'NetworkTopologyStrategy'
-            strategy_opts[datacenter] = n
+            strategy_opts[datacenter] = num_nodes
             if 'replication_factor' in strategy_opts:
                 del strategy_opts['replication_factor']
             helpers.set_auth_keyspace_replication(session, strategy_opts)
-            helpers.repair_auth_keyspace()
+            if rf < num_nodes:
+                # Increasing rf, need to run repair.
+                helpers.repair_auth_keyspace()
     helpers.set_active()
 
 
@@ -565,7 +560,9 @@
 
     Per documented procedure for adding new units to a cluster, wait 2
     minutes if the unit has just bootstrapped to ensure other units
-    do not attempt bootstrap too soon.
+    do not attempt bootstrap too soon. Also, wait until completed joining
+    to ensure we keep the lock and ensure other nodes don't restart or
+    bootstrap.
     '''
     if not helpers.is_bootstrapped():
         if coordinator.relid is not None:
@@ -573,6 +570,28 @@
         hookenv.log('Post-bootstrap 2 minute delay')
         time.sleep(120)  # Must wait 2 minutes between bootstrapping nodes.
 
+        join_msg_set = False
+        while True:
+            status = helpers.get_node_status()
+            if status == 'NORMAL':
+                break
+            elif status == 'JOINING':
+                if not join_msg_set:
+                    helpers.status_set('maintenance', 'Still joining cluster')
+                    join_msg_set = True
+                time.sleep(10)
+                continue
+            else:
+                if status is None:
+                    helpers.status_set('blocked',
+                                       'Unexpectedly shutdown during '
+                                       'bootstrap')
+                else:
+                    helpers.status_set('blocked',
+                                       'Failed to bootstrap ({})'
+                                       ''.format(status))
+                raise SystemExit(0)
+
     # Unconditionally call this to publish the bootstrapped flag to
     # the peer relation, as the first unit was bootstrapped before
     # the peer relation existed.

=== modified file 'tests/test_actions.py'
--- tests/test_actions.py 2016-02-24 10:02:13 +0000
+++ tests/test_actions.py 2016-02-24 10:02:13 +0000
@@ -535,7 +535,7 @@
         connect().__enter__.return_value = sentinel.session
         connect().__exit__.return_value = False
 
-        num_nodes.return_value = 4
+        num_nodes.return_value = 3
         get_auth_ks_rep.return_value = {'another': '8',
                                         'mydc': '3'}
         self.assertFalse(actions.needs_reset_auth_keyspace_replication())
@@ -565,7 +565,7 @@
         actions.reset_auth_keyspace_replication('')
         set_auth_ks_rep.assert_called_once_with(
             sentinel.session,
-            {'class': 'NetworkTopologyStrategy', 'another': '8', 'mydc': 3})
+            {'class': 'NetworkTopologyStrategy', 'another': '8', 'mydc': 4})
         repair.assert_called_once_with()
         set_active.assert_called_once_with()
