Merge lp:~hopem/charms/trusty/swift-proxy/lp1448884 into lp:~openstack-charmers-archive/charms/trusty/swift-proxy/next

Proposed by Edward Hope-Morley
Status: Work in progress
Proposed branch: lp:~hopem/charms/trusty/swift-proxy/lp1448884
Merge into: lp:~openstack-charmers-archive/charms/trusty/swift-proxy/next
Diff against target: 941 lines (+380/-140)
5 files modified
README.Swift_ring_management (+28/-0)
hooks/swift_hooks.py (+128/-45)
lib/swift_utils.py (+133/-65)
unit_tests/test_swift_hooks.py (+11/-9)
unit_tests/test_swift_utils.py (+80/-21)
To merge this branch: bzr merge lp:~hopem/charms/trusty/swift-proxy/lp1448884
Reviewer Review Type Date Requested Status
OpenStack Charmers Pending
Review via email: mp+282296@code.launchpad.net
Revision history for this message
Chris Holcombe (xfactor973) wrote :

Ed, first off, this is fantastic work! This will greatly help the swift charm along. I have a few questions though.

Could you talk a little bit about what cases this solves? For example:
1. Adding a new node to the swift cluster
2. Deleting a node from the cluster
3. Reimaging a previous node and putting it back into service
4. A node goes offline for repairs and then comes back later (say a day or a week)
Can this handle all these cases? It's hard to tell just from sifting through the code.

Are there potential race conditions with this that can leave the cluster in a broken state if the leader changes midway through the sync? Just curious.

I had a few little questions inline in the code about variable names.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Hey Chris, sure thing and thanks for the review.

I've added README.Swift_ring_management with a basic explanation of swift ring
and builder management in the charm. We can always add more over time as and when needed.

First, I'd like to stress that this is mostly a refactor (upgrade-safe) aimed at making the ring sync process more robust and cleaner, to facilitate easier debugging.

With regards to:

   deleting nodes - for storage nodes, this is currently a manual process and
                    something we still need to add support for in the charm
                    (probably using actions). For proxy units, information
                    about proxy units/hosts is not in the rings/builders, so
                    as long as the rings are synced when the leader switches,
                    removing a proxy unit should not interrupt service.

   repairing nodes - again, same as above

I've responded to your comment inline as well.

Revision history for this message
Edward Hope-Morley (hopem) wrote :

(this is a hopefully better formatted version of my previous comment)

Hey Chris, sure thing.

I've added README.Swift_ring_management with a basic explanation of swift ring and builder management in the charm. We can always add more over time as and when needed.

With regards to:

   deleting nodes - for storage nodes, this is currently a manual process and something we still need to add support for in the charm (probably using actions). For proxy units, information about proxy units/hosts is not in the rings/builders, so as long as the rings are synced when the leader switches, removing a proxy unit should not interrupt service.

   repairing nodes - again, same as above

I've responded to your comment inline as well.
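
For reference, this is roughly what manual removal of a storage device looks like today (a minimal sketch, not part of this branch — it assumes you run it against the builder files on the leader proxy unit, normally under /etc/swift, and the device search value is purely illustrative):

    import subprocess

    SWIFT_CONF_DIR = '/etc/swift'
    BUILDERS = ['account.builder', 'container.builder', 'object.builder']

    def remove_device(search_value):
        """Remove a device (e.g. 'z1-192.168.1.10/sdb') from all three rings
        and rebalance so the partitions it held are reassigned."""
        for builder in BUILDERS:
            subprocess.check_call(['swift-ring-builder', builder,
                                   'remove', search_value],
                                  cwd=SWIFT_CONF_DIR)
            subprocess.check_call(['swift-ring-builder', builder,
                                   'rebalance'], cwd=SWIFT_CONF_DIR)

The charm is then still responsible for distributing the updated rings to the other units.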

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_lint_check #835 swift-proxy-next for hopem mp282296
    LINT OK: passed

Build: http://10.245.162.36:8080/job/charm_lint_check/835/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_unit_test #739 swift-proxy-next for hopem mp282296
    UNIT OK: passed

Build: http://10.245.162.36:8080/job/charm_unit_test/739/

Revision history for this message
uosci-testing-bot (uosci-testing-bot) wrote :

charm_amulet_test #337 swift-proxy-next for hopem mp282296
    AMULET OK: passed

Build: http://10.245.162.36:8080/job/charm_amulet_test/337/

Revision history for this message
Edward Hope-Morley (hopem) wrote :

Based on the new workflow I've migrated this to git for review - https://review.openstack.org/#/c/287123/

Unmerged revisions

136. By Edward Hope-Morley

add README.Swift_ring_management

135. By Edward Hope-Morley

sync /next

134. By Edward Hope-Morley

[hopem,r=]

Make the ring sync code clearer and implement
tolerance for leader-switch during or after
sync.

Closes-Bug: 1448884
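
For orientation before reading the diff: the heart of the sync protocol is that the leader broadcasts a stop-proxy-service token over the cluster relation and only publishes the rings/builders for sync once every peer has echoed that exact token back as stop-proxy-service-ack. The real check is is_all_peers_stopped() in hooks/swift_hooks.py; the self-contained sketch below just operates on plain dicts and omits the relation plumbing:

    import uuid

    ACK_KEY = 'stop-proxy-service-ack'

    def all_peers_acked(sent_token, peer_responses):
        """True only if every peer response echoes back the token this unit
        originally broadcast with its stop request."""
        acks = [r.get(ACK_KEY) for r in peer_responses]
        return bool(acks) and all(ack == sent_token for ack in acks)

    token = str(uuid.uuid4())
    assert all_peers_acked(token, [{ACK_KEY: token}, {ACK_KEY: token}])
    assert not all_peers_acked(token, [{ACK_KEY: token}, {}])  # still waiting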

Preview Diff

1=== added file 'README.Swift_ring_management'
2--- README.Swift_ring_management 1970-01-01 00:00:00 +0000
3+++ README.Swift_ring_management 2016-02-18 13:23:54 +0000
4@@ -0,0 +1,28 @@
5+== Swift Charm Ring Management ==
6+
7+Swift uses rings and builders to manage data distribution across storage devices
8+in the cluster. More information on this can be found in the upstream
9+documentation [0].
10+
11+In order to function correctly, the rings and builders need to be the same across
12+all swift proxy units and the rings need to be distributed to storage units. The
13+swift proxy charm achieves this by electing a leader unit and having that unit
14+manage ring and builder files by updating them when new nodes or devices are added
15+to the cluster and distributing those files to other nodes in the cluster.
16+
17+Since over time the leader may change, ring syncs use acknowledgements from peer
18+units to determine whether a synchronisation has completed. This way, if the leader
19+switches to another unit, we are able to know that that unit has up-to-date
20+ring and builder files.
21+
22+When new devices are added to storage units, the leader proxy unit is notified and
23+adds them to the ring. Once complete, the leader unit will broadcast a notification
24+that rings and builders are ready to be synced across the cluster (only proxy units
25+get builders) and each unit in the cluster should then begin syncing from the leader
26+and acknowledge receipt.
27+
28+During synchronisation, swift-proxy services are stopped in order to avoid
29+requests being handled while rings are being updated.
30+
31+
32+[0] http://docs.openstack.org/developer/swift/overview_ring.html
33
34=== modified file 'hooks/swift_hooks.py'
35--- hooks/swift_hooks.py 2015-11-27 09:48:45 +0000
36+++ hooks/swift_hooks.py 2016-02-18 13:23:54 +0000
37@@ -33,6 +33,10 @@
38 get_first_available_value,
39 all_responses_equal,
40 ensure_www_dir_permissions,
41+ sync_builders_and_rings_if_changed,
42+ cluster_sync_rings,
43+ is_most_recent_timestamp,
44+ timestamps_available,
45 is_paused,
46 pause_aware_restart_on_change,
47 assess_status,
48@@ -143,7 +147,7 @@
49 do_openstack_upgrade(CONFIGS)
50 status_set('maintenance', 'Running openstack upgrade')
51
52- status_set('maintenance', 'Updating and balancing rings')
53+ status_set('maintenance', 'Updating and (maybe) balancing rings')
54 update_rings(min_part_hours=config('min-hours'))
55
56 if not config('disable-ring-balance') and is_elected_leader(SWIFT_HA_RES):
57@@ -322,7 +326,7 @@
58 private_addr = unit_get('private-address')
59
60
61-def all_peers_stopped(responses):
62+def is_all_peers_stopped(responses):
63 """Establish whether all peers have stopped their proxy services.
64
65 Each peer unit will set stop-proxy-service-ack to rq value to indicate that
66@@ -336,11 +340,15 @@
67 ack_key = SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC_ACK
68 token = relation_get(attribute=rq_key, unit=local_unit())
69 if not token or token != responses[0].get(ack_key):
70- log("Unmatched token in ack (expected=%s, got=%s)" %
71+ log("Token mismatch, rq and ack tokens differ (expected=%s, "
72+ "got=%s)" %
73 (token, responses[0].get(ack_key)), level=DEBUG)
74 return False
75
76 if not all_responses_equal(responses, ack_key):
77+ log("Not all ack responses are equal. Either we are still waiting "
78+ "for responses or we were not the request originator.",
79+ level=DEBUG)
80 return False
81
82 return True
83@@ -354,20 +362,44 @@
84 log("Cluster changed by unit=%s (local is leader)" % (remote_unit()),
85 level=DEBUG)
86
87- # If we have received an ack, check other units
88- settings = relation_get() or {}
89- ack_key = SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC_ACK
90-
91- # Protect against leader changing mid-sync
92- if settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC):
93- log("Sync request received yet this is leader unit. This would "
94- "indicate that the leader has changed mid-sync - stopping proxy "
95- "and notifying peers", level=ERROR)
96- service_stop('swift-proxy')
97- SwiftProxyClusterRPC().notify_leader_changed()
98- return
99- elif ack_key in settings:
100- token = settings[ack_key]
101+ rx_settings = relation_get() or {}
102+ tx_settings = relation_get(unit=local_unit()) or {}
103+
104+ rx_rq_token = rx_settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC)
105+ rx_ack_token = rx_settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC_ACK)
106+
107+ tx_rq_token = tx_settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC)
108+ tx_ack_token = tx_settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC_ACK)
109+
110+ rx_leader_changed = \
111+ rx_settings.get(SwiftProxyClusterRPC.KEY_NOTIFY_LEADER_CHANGED)
112+ if rx_leader_changed:
113+ log("Leader change notification received and this is leader so "
114+ "retrying sync.", level=INFO)
115+ # FIXME: check that we were previously part of a successful sync to
116+ # ensure we have good rings.
117+ cluster_sync_rings(peers_only=tx_settings.get('peers-only', False),
118+ token=rx_leader_changed)
119+ return
120+
121+ rx_resync_request = \
122+ rx_settings.get(SwiftProxyClusterRPC.KEY_REQUEST_RESYNC)
123+ resync_request_ack_key = SwiftProxyClusterRPC.KEY_REQUEST_RESYNC_ACK
124+ tx_resync_request_ack = tx_settings.get(resync_request_ack_key)
125+ if rx_resync_request and tx_resync_request_ack != rx_resync_request:
126+ log("Unit '%s' has requested a resync" % (remote_unit()),
127+ level=INFO)
128+ cluster_sync_rings(peers_only=True)
129+ relation_set(**{resync_request_ack_key: rx_resync_request})
130+ return
131+
132+ # If we have received an ack token ensure it is not associated with a
133+ # request we received from another peer. If it is, this would indicate
134+ # a leadership change during a sync and this unit will abort the sync or
135+ # attempt to restore the original leader so to be able to complete the
136+ # sync.
137+
138+ if rx_ack_token and rx_ack_token == tx_rq_token:
139 # Find out if all peer units have been stopped.
140 responses = []
141 for rid in relation_ids('cluster'):
142@@ -375,21 +407,38 @@
143 responses.append(relation_get(rid=rid, unit=unit))
144
145 # Ensure all peers stopped before starting sync
146- if all_peers_stopped(responses):
147+ if is_all_peers_stopped(responses):
148 key = 'peers-only'
149 if not all_responses_equal(responses, key, must_exist=False):
150 msg = ("Did not get equal response from every peer unit for "
151 "'%s'" % (key))
152 raise SwiftProxyCharmException(msg)
153
154- peers_only = int(get_first_available_value(responses, key,
155- default=0))
156+ peers_only = bool(get_first_available_value(responses, key,
157+ default=0))
158 log("Syncing rings and builders (peers-only=%s)" % (peers_only),
159 level=DEBUG)
160- broadcast_rings_available(token, storage=not peers_only)
161+ broadcast_rings_available(broker_token=rx_ack_token,
162+ storage=not peers_only)
163 else:
164 log("Not all peer apis stopped - skipping sync until all peers "
165 "ready (got %s)" % (responses), level=INFO)
166+ elif ((rx_ack_token and (rx_ack_token == tx_ack_token)) or
167+ (rx_rq_token and (rx_rq_token == rx_ack_token))):
168+ log("It appears that the cluster leader has changed mid-sync - "
169+ "stopping proxy service", level=WARNING)
170+ service_stop('swift-proxy')
171+ broker = rx_settings.get('builder-broker')
172+ if broker:
173+ # If we get here, manual intervention will be required in order
174+ # to restore the cluster
175+ log("Failed to restore previous broker '%s' as leader" %
176+ (broker), level=ERROR)
177+ else:
178+ log("No builder-broker on rx_settings relation from '%s' - unable "
179+ "to attempt leader restore" % (remote_unit()), level=INFO)
180+ else:
181+ log("Not taking any sync actions", level=DEBUG)
182
183 CONFIGS.write_all()
184
185@@ -401,31 +450,74 @@
186 """
187 log("Cluster changed by unit=%s (local is non-leader)" % (remote_unit()),
188 level=DEBUG)
189- settings = relation_get() or {}
190+ rx_settings = relation_get() or {}
191+ tx_settings = relation_get(unit=local_unit()) or {}
192+
193+ token = rx_settings.get(SwiftProxyClusterRPC.KEY_NOTIFY_LEADER_CHANGED)
194+ if token:
195+ log("Leader-changed notification received from peer unit. Since "
196+ "this most likely occurred during a ring sync proxies will "
197+ "be disabled until the leader is restored and a fresh sync "
198+ "request is set out", level=WARNING)
199+ service_stop("swift-proxy")
200+ return
201+
202+ rx_rq_token = rx_settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC)
203
204 # Check whether we have been requested to stop proxy service
205- rq_key = SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC
206- token = settings.get(rq_key, None)
207- if token:
208+ if rx_rq_token:
209 log("Peer request to stop proxy service received (%s) - sending ack" %
210- (token), level=INFO)
211+ (rx_rq_token), level=INFO)
212 service_stop('swift-proxy')
213- peers_only = settings.get('peers-only', None)
214- rq = SwiftProxyClusterRPC().stop_proxy_ack(echo_token=token,
215+ peers_only = rx_settings.get('peers-only', None)
216+ rq = SwiftProxyClusterRPC().stop_proxy_ack(echo_token=rx_rq_token,
217 echo_peers_only=peers_only)
218 relation_set(relation_settings=rq)
219 return
220
221 # Check if there are any builder files we can sync from the leader.
222- log("Non-leader peer - checking if updated rings available", level=DEBUG)
223- broker = settings.get('builder-broker', None)
224+ broker = rx_settings.get('builder-broker', None)
225+ broker_token = rx_settings.get('broker-token', None)
226+ broker_timestamp = rx_settings.get('broker-timestamp', None)
227+ tx_ack_token = tx_settings.get(SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC_ACK)
228 if not broker:
229- log("No update available", level=DEBUG)
230+ log("No ring/builder update available", level=DEBUG)
231 if not is_paused():
232 service_start('swift-proxy')
233- return
234-
235- builders_only = int(settings.get('sync-only-builders', 0))
236+
237+ return
238+ elif broker_token:
239+ if tx_ack_token:
240+ if broker_token == tx_ack_token:
241+ log("Broker and ACK tokens match (%s)" % (broker_token),
242+ level=DEBUG)
243+ else:
244+ log("Received ring/builder update notification but tokens do "
245+ "not match (broker-token=%s/ack-token=%s)" %
246+ (broker_token, tx_ack_token), level=WARNING)
247+ return
248+ else:
249+ log("Broker token available without handshake, assuming we just "
250+ "joined and rings won't change", level=DEBUG)
251+ else:
252+ log("Not taking any sync actions", level=DEBUG)
253+ return
254+
255+ # If we upgrade from cluster that did not use timestamps, the new peer will
256+ # need to request a re-sync from the leader
257+ if not is_most_recent_timestamp(broker_timestamp):
258+ if not timestamps_available(excluded_unit=remote_unit()):
259+ log("Requesting resync")
260+ rq = SwiftProxyClusterRPC().request_resync(broker_token)
261+ relation_set(relation_settings=rq)
262+ else:
263+ log("Did not receive most recent broker timestamp but timestamps "
264+ "are available - waiting for next timestamp", level=INFO)
265+
266+ return
267+
268+ log("Ring/builder update available", level=DEBUG)
269+ builders_only = int(rx_settings.get('sync-only-builders', 0))
270 path = os.path.basename(get_www_dir())
271 try:
272 sync_proxy_rings('http://%s/%s' % (broker, path),
273@@ -433,7 +525,7 @@
274 except CalledProcessError:
275 log("Ring builder sync failed, builders not yet available - "
276 "leader not ready?", level=WARNING)
277- return None
278+ return
279
280 # Re-enable the proxy once all builders and rings are synced
281 if fully_synced():
282@@ -449,16 +541,6 @@
283 @hooks.hook('cluster-relation-changed')
284 @pause_aware_restart_on_change(restart_map())
285 def cluster_changed():
286- key = SwiftProxyClusterRPC.KEY_NOTIFY_LEADER_CHANGED
287- leader_changed = relation_get(attribute=key)
288- if leader_changed:
289- log("Leader changed notification received from peer unit. Since this "
290- "most likely occurred during a ring sync proxies will be "
291- "disabled until the leader is restored and a fresh sync request "
292- "is set out", level=WARNING)
293- service_stop("swift-proxy")
294- return
295-
296 if is_elected_leader(SWIFT_HA_RES):
297 cluster_leader_actions()
298 else:
299@@ -466,6 +548,7 @@
300
301
302 @hooks.hook('ha-relation-changed')
303+@sync_builders_and_rings_if_changed
304 def ha_relation_changed():
305 clustered = relation_get('clustered')
306 if clustered:
307
308=== modified file 'lib/swift_utils.py'
309--- lib/swift_utils.py 2015-11-06 18:33:58 +0000
310+++ lib/swift_utils.py 2016-02-18 13:23:54 +0000
311@@ -7,6 +7,7 @@
312 import subprocess
313 import tempfile
314 import threading
315+import time
316 import uuid
317
318 from collections import OrderedDict
319@@ -39,6 +40,7 @@
320 INFO,
321 WARNING,
322 config,
323+ local_unit,
324 relation_get,
325 unit_get,
326 relation_set,
327@@ -180,40 +182,56 @@
328 KEY_STOP_PROXY_SVC = 'stop-proxy-service'
329 KEY_STOP_PROXY_SVC_ACK = 'stop-proxy-service-ack'
330 KEY_NOTIFY_LEADER_CHANGED = 'leader-changed-notification'
331+ KEY_REQUEST_RESYNC = 'resync-request'
332+ KEY_REQUEST_RESYNC_ACK = 'resync-request-ack'
333
334 def __init__(self, version=1):
335 self._version = version
336
337+ @property
338+ def _hostname(self):
339+ hostname = get_hostaddr()
340+ return format_ipv6_addr(hostname) or hostname
341+
342 def template(self):
343 # Everything must be None by default so it gets dropped from the
344 # relation unless we want it to be set.
345 templates = {1: {'trigger': None,
346 'broker-token': None,
347 'builder-broker': None,
348+ 'broker-timestamp': None,
349 self.KEY_STOP_PROXY_SVC: None,
350 self.KEY_STOP_PROXY_SVC_ACK: None,
351 self.KEY_NOTIFY_LEADER_CHANGED: None,
352+ self.KEY_REQUEST_RESYNC: None,
353 'peers-only': None,
354 'sync-only-builders': None}}
355 return copy.deepcopy(templates[self._version])
356
357- def stop_proxy_request(self, peers_only=False):
358+ def stop_proxy_request(self, peers_only=False, token=None):
359 """Request to stop peer proxy service.
360
361- NOTE: leader action
362+ A token can optionally be supplied in case we want to restart a sync
363+ e.g. following a leader change notification.
364+
365+ NOTE: this action must only be performed by the cluster leader.
366 """
367 rq = self.template()
368- rq['trigger'] = str(uuid.uuid4())
369+ if not token:
370+ token = str(uuid.uuid4())
371+
372+ rq['trigger'] = token
373 rq[self.KEY_STOP_PROXY_SVC] = rq['trigger']
374 if peers_only:
375 rq['peers-only'] = 1
376
377+ rq['builder-broker'] = self._hostname
378 return rq
379
380 def stop_proxy_ack(self, echo_token, echo_peers_only):
381 """Ack that peer proxy service is stopped.
382
383- NOTE: non-leader action
384+ NOTE: this action must NOT be performed by the cluster leader.
385 """
386 rq = self.template()
387 rq['trigger'] = str(uuid.uuid4())
388@@ -222,11 +240,10 @@
389 rq['peers-only'] = echo_peers_only
390 return rq
391
392- def sync_rings_request(self, broker_host, broker_token,
393- builders_only=False):
394+ def sync_rings_request(self, broker_token, builders_only=False):
395 """Request for peer to sync rings.
396
397- NOTE: leader action
398+ NOTE: this action must only be performed by the cluster leader.
399 """
400 rq = self.template()
401 rq['trigger'] = str(uuid.uuid4())
402@@ -235,17 +252,34 @@
403 rq['sync-only-builders'] = 1
404
405 rq['broker-token'] = broker_token
406- rq['builder-broker'] = broker_host
407+ rq['broker-timestamp'] = "%f" % time.time()
408+ rq['builder-broker'] = self._hostname
409 return rq
410
411- def notify_leader_changed(self):
412+ def notify_leader_changed(self, token):
413 """Notify peers that leader has changed.
414
415- NOTE: leader action
416- """
417- rq = self.template()
418- rq['trigger'] = str(uuid.uuid4())
419- rq[self.KEY_NOTIFY_LEADER_CHANGED] = rq['trigger']
420+ The token passed in must be that associated with the sync we claim to
421+ have been interrupted. It will be re-used by the restored leader once
422+ it receives this notification.
423+
424+ NOTE: this action must only be performed by the cluster leader that
425+ has relinquished it's leader status as part of the current hook
426+ context.
427+ """
428+ rq = self.template()
429+ rq['trigger'] = str(uuid.uuid4())
430+ rq[self.KEY_NOTIFY_LEADER_CHANGED] = token
431+ return rq
432+
433+ def request_resync(self, token):
434+ """Request re-sync from leader.
435+
436+ NOTE: this action must not be performed by the cluster leader.
437+ """
438+ rq = self.template()
439+ rq['trigger'] = str(uuid.uuid4())
440+ rq[self.KEY_REQUEST_RESYNC] = token
441 return rq
442
443
444@@ -704,29 +738,27 @@
445 return sha.hexdigest()
446
447
448-def get_broker_token():
449- """Get ack token from peers to be used as broker token.
450-
451- Must be equal across all peers.
452-
453- Returns token or None if not found.
454+def non_null_unique(data):
455+ """Return True if data is a list containing all non-null values that are
456+ all equal.
457 """
458- responses = []
459- ack_key = SwiftProxyClusterRPC.KEY_STOP_PROXY_SVC_ACK
460+ return (all(data) and len(set(data)) > 1)
461+
462+
463+def previously_synced():
464+ """If a full sync is not known to have been performed from this unit, False
465+ is returned."""
466+ broker = None
467 for rid in relation_ids('cluster'):
468- for unit in related_units(rid):
469- responses.append(relation_get(attribute=ack_key, rid=rid,
470- unit=unit))
471-
472- # If no acks exist we have probably never done a sync so make up a token
473- if len(responses) == 0:
474- return str(uuid.uuid4())
475-
476- if not all(responses) or len(set(responses)) != 1:
477- log("Not all ack tokens equal - %s" % (responses), level=DEBUG)
478- return None
479-
480- return responses[0]
481+ broker = relation_get(attribute='builder-broker', rid=rid,
482+ unit=local_unit())
483+ only_builders_synced = relation_get(attribute='sync-only-builders',
484+ rid=rid, unit=local_unit())
485+
486+ if broker and not only_builders_synced:
487+ return True
488+
489+ return False
490
491
492 def sync_builders_and_rings_if_changed(f):
493@@ -757,7 +789,8 @@
494 rings_path = os.path.join(SWIFT_CONF_DIR, '*.%s' %
495 (SWIFT_RING_EXT))
496 rings_ready = len(glob.glob(rings_path)) == len(SWIFT_RINGS)
497- rings_changed = rings_after != rings_before
498+ rings_changed = ((rings_after != rings_before) or
499+ not previously_synced())
500 builders_changed = builders_after != builders_before
501 if rings_changed or builders_changed:
502 # Copy builders and rings (if available) to the server dir.
503@@ -766,9 +799,9 @@
504 # Trigger sync
505 cluster_sync_rings(peers_only=not rings_changed)
506 else:
507- cluster_sync_rings(peers_only=True, builders_only=True)
508 log("Rings not ready for sync - syncing builders",
509 level=DEBUG)
510+ cluster_sync_rings(peers_only=True, builders_only=True)
511 else:
512 log("Rings/builders unchanged - skipping sync", level=DEBUG)
513
514@@ -869,20 +902,31 @@
515 "skipping", level=WARNING)
516 return
517
518- hostname = get_hostaddr()
519- hostname = format_ipv6_addr(hostname) or hostname
520+ if not broker_token:
521+ log("No broker token - aborting sync", level=WARNING)
522+ return
523+
524+ cluster_rids = relation_ids('cluster')
525+ if not cluster_rids:
526+ log("Cluster relation not yet available - skipping sync", level=DEBUG)
527+ return
528+
529+ if builders_only:
530+ type = "builders"
531+ else:
532+ type = "builders & rings"
533+
534 # Notify peers that builders are available
535- log("Notifying peer(s) that rings are ready for sync.", level=INFO)
536- rq = SwiftProxyClusterRPC().sync_rings_request(hostname,
537- broker_token,
538+ log("Notifying peer(s) that %s are ready for sync." % type, level=INFO)
539+ rq = SwiftProxyClusterRPC().sync_rings_request(broker_token,
540 builders_only=builders_only)
541- for rid in relation_ids('cluster'):
542+ for rid in cluster_rids:
543 log("Notifying rid=%s (%s)" % (rid, rq), level=DEBUG)
544 relation_set(relation_id=rid, relation_settings=rq)
545
546
547-def broadcast_rings_available(broker_token, peers=True, storage=True,
548- builders_only=False):
549+def broadcast_rings_available(storage=True, builders_only=False,
550+ broker_token=None):
551 """Notify storage relations and cluster (peer) relations that rings and
552 builders are availble for sync.
553
554@@ -895,14 +939,13 @@
555 else:
556 log("Skipping notify storage relations", level=DEBUG)
557
558- if peers:
559- notify_peers_builders_available(broker_token,
560- builders_only=builders_only)
561- else:
562- log("Skipping notify peer relations", level=DEBUG)
563-
564-
565-def cluster_sync_rings(peers_only=False, builders_only=False):
566+ # Always set peer info even if not clustered so that info is present when
567+ # units join
568+ notify_peers_builders_available(broker_token,
569+ builders_only=builders_only)
570+
571+
572+def cluster_sync_rings(peers_only=False, builders_only=False, token=None):
573 """Notify peer relations that they should stop their proxy services.
574
575 Peer units will then be expected to do a relation_set with
576@@ -923,24 +966,21 @@
577 # If we have no peer units just go ahead and broadcast to storage
578 # relations. If we have been instructed to only broadcast to peers this
579 # should do nothing.
580- broker_token = get_broker_token()
581- broadcast_rings_available(broker_token, peers=False,
582+ broadcast_rings_available(broker_token=str(uuid.uuid4()),
583 storage=not peers_only)
584 return
585 elif builders_only:
586+ if not token:
587+ token = str(uuid.uuid4())
588+
589 # No need to stop proxies if only syncing builders between peers.
590- broker_token = get_broker_token()
591- broadcast_rings_available(broker_token, storage=False,
592- builders_only=builders_only)
593+ broadcast_rings_available(storage=False, builders_only=True,
594+ broker_token=token)
595 return
596
597- rel_ids = relation_ids('cluster')
598- trigger = str(uuid.uuid4())
599-
600- log("Sending request to stop proxy service to all peers (%s)" % (trigger),
601- level=INFO)
602- rq = SwiftProxyClusterRPC().stop_proxy_request(peers_only)
603- for rid in rel_ids:
604+ log("Sending request to stop proxy service to all peers", level=INFO)
605+ rq = SwiftProxyClusterRPC().stop_proxy_request(peers_only, token=token)
606+ for rid in relation_ids('cluster'):
607 relation_set(relation_id=rid, relation_settings=rq)
608
609
610@@ -996,6 +1036,34 @@
611 return unit_get('private-address')
612
613
614+def is_most_recent_timestamp(timestamp):
615+ ts = []
616+ for rid in relation_ids('cluster'):
617+ for unit in related_units(rid):
618+ settings = relation_get(rid=rid, unit=unit)
619+ t = settings.get('broker-timestamp')
620+ if t:
621+ ts.append(t)
622+
623+ if not ts or not timestamp:
624+ return False
625+
626+ return timestamp >= max(ts)
627+
628+
629+def timestamps_available(excluded_unit):
630+ for rid in relation_ids('cluster'):
631+ for unit in related_units(rid):
632+ if unit == excluded_unit:
633+ continue
634+
635+ settings = relation_get(rid=rid, unit=unit)
636+ if settings.get('broker-timestamp'):
637+ return True
638+
639+ return False
640+
641+
642 def is_paused(status_get=status_get):
643 """Is the unit paused?"""
644 with HookData()():
645
646=== modified file 'unit_tests/test_swift_hooks.py'
647--- unit_tests/test_swift_hooks.py 2015-11-27 09:48:45 +0000
648+++ unit_tests/test_swift_hooks.py 2016-02-18 13:23:54 +0000
649@@ -8,41 +8,43 @@
650 )
651
652 sys.path.append("hooks")
653-with patch('charmhelpers.core.hookenv.log'):
654- with patch('lib.swift_utils.is_paused') as is_paused:
655- import swift_hooks
656+# swift_hooks.py does a register_configs() on import
657+with patch('charmhelpers.contrib.openstack.utils.apt_cache', lambda: None):
658+ with patch('charmhelpers.core.hookenv.log'):
659+ with patch('lib.swift_utils.is_paused') as is_paused:
660+ import swift_hooks
661
662
663 class SwiftHooksTestCase(unittest.TestCase):
664
665 @patch("swift_hooks.relation_get")
666 @patch("swift_hooks.local_unit")
667- def test_all_peers_stopped(self, mock_local_unit, mock_relation_get):
668+ def test_is_all_peers_stopped(self, mock_local_unit, mock_relation_get):
669 token1 = str(uuid.uuid4())
670 token2 = str(uuid.uuid4())
671 mock_relation_get.return_value = token1
672
673 responses = [{'some-other-key': token1}]
674- self.assertFalse(swift_hooks.all_peers_stopped(responses))
675+ self.assertFalse(swift_hooks.is_all_peers_stopped(responses))
676
677 responses = [{'stop-proxy-service-ack': token1},
678 {'stop-proxy-service-ack': token2}]
679- self.assertFalse(swift_hooks.all_peers_stopped(responses))
680+ self.assertFalse(swift_hooks.is_all_peers_stopped(responses))
681
682 responses = [{'stop-proxy-service-ack': token1},
683 {'stop-proxy-service-ack': token1},
684 {'some-other-key': token1}]
685- self.assertFalse(swift_hooks.all_peers_stopped(responses))
686+ self.assertFalse(swift_hooks.is_all_peers_stopped(responses))
687
688 responses = [{'stop-proxy-service-ack': token1},
689 {'stop-proxy-service-ack': token1}]
690- self.assertTrue(swift_hooks.all_peers_stopped(responses))
691+ self.assertTrue(swift_hooks.is_all_peers_stopped(responses))
692
693 mock_relation_get.return_value = token2
694
695 responses = [{'stop-proxy-service-ack': token1},
696 {'stop-proxy-service-ack': token1}]
697- self.assertFalse(swift_hooks.all_peers_stopped(responses))
698+ self.assertFalse(swift_hooks.is_all_peers_stopped(responses))
699
700 @patch.object(swift_hooks, 'config')
701 @patch('charmhelpers.contrib.openstack.ip.config')
702
703=== modified file 'unit_tests/test_swift_utils.py'
704--- unit_tests/test_swift_utils.py 2015-11-03 13:52:37 +0000
705+++ unit_tests/test_swift_utils.py 2016-02-18 13:23:54 +0000
706@@ -22,7 +22,7 @@
707
708 class SwiftUtilsTestCase(unittest.TestCase):
709
710- @mock.patch('lib.swift_utils.get_broker_token')
711+ @mock.patch.object(swift_utils, 'previously_synced')
712 @mock.patch('lib.swift_utils.update_www_rings')
713 @mock.patch('lib.swift_utils.get_builders_checksum')
714 @mock.patch('lib.swift_utils.get_rings_checksum')
715@@ -38,12 +38,12 @@
716 mock_log, mock_balance_rings,
717 mock_get_rings_checksum,
718 mock_get_builders_checksum, mock_update_www_rings,
719- mock_get_broker_token):
720- mock_get_broker_token.return_value = "token1"
721+ mock_previously_synced):
722
723 # Make sure same is returned for both so that we don't try to sync
724 mock_get_rings_checksum.return_value = None
725 mock_get_builders_checksum.return_value = None
726+ mock_previously_synced.return_value = True
727
728 # Test blocker 1
729 mock_is_elected_leader.return_value = False
730@@ -77,25 +77,23 @@
731 self.assertTrue(mock_set_min_hours.called)
732 self.assertTrue(mock_balance_rings.called)
733
734+ @mock.patch('lib.swift_utils.previously_synced')
735 @mock.patch('lib.swift_utils._load_builder')
736 @mock.patch('lib.swift_utils.initialize_ring')
737- @mock.patch('lib.swift_utils.get_broker_token')
738 @mock.patch('lib.swift_utils.update_www_rings')
739 @mock.patch('lib.swift_utils.get_builders_checksum')
740 @mock.patch('lib.swift_utils.get_rings_checksum')
741 @mock.patch('lib.swift_utils.balance_rings')
742 @mock.patch('lib.swift_utils.log')
743 @mock.patch('lib.swift_utils.is_elected_leader')
744- def test_update_rings_multiple_devs(self,
745- mock_is_elected_leader,
746+ def test_update_rings_multiple_devs(self, mock_is_elected_leader,
747 mock_log, mock_balance_rings,
748 mock_get_rings_checksum,
749 mock_get_builders_checksum,
750 mock_update_www_rings,
751- mock_get_broker_token,
752 mock_initialize_ring,
753 mock_load_builder,
754- ):
755+ mock_previously_synced):
756 # To avoid the need for swift.common.ring library, mock a basic
757 # rings dictionary, keyed by path.
758 # Each ring has enough logic to hold a dictionary with a single 'devs'
759@@ -115,11 +113,13 @@
760
761 def add_dev(self, dev):
762 mock_rings[self.path]['devs'].append(dev)
763+
764 return mock_ring(path)
765
766 def mock_initialize_ring_fn(path, *args):
767 mock_rings.setdefault(path, {'devs': []})
768
769+ mock_is_elected_leader.return_value = True
770 mock_load_builder.side_effect = mock_load_builder_fn
771 mock_initialize_ring.side_effect = mock_initialize_ring_fn
772
773@@ -153,7 +153,6 @@
774 swift_utils.update_rings(nodes)
775 self.assertFalse(mock_add_to_ring.called)
776
777- @mock.patch('lib.swift_utils.get_broker_token')
778 @mock.patch('lib.swift_utils.balance_rings')
779 @mock.patch('lib.swift_utils.log')
780 @mock.patch('lib.swift_utils.is_elected_leader')
781@@ -165,9 +164,7 @@
782 mock_config,
783 mock_is_elected_leader,
784 mock_log,
785- mock_balance_rings,
786- mock_get_broker_token):
787- mock_get_broker_token.return_value = "token1"
788+ mock_balance_rings):
789
790 @swift_utils.sync_builders_and_rings_if_changed
791 def mock_balance():
792@@ -200,6 +197,7 @@
793 finally:
794 shutil.rmtree(tmpdir)
795
796+ @mock.patch.object(swift_utils, 'get_hostaddr', lambda *args: '1.2.3.4')
797 @mock.patch('lib.swift_utils.uuid')
798 def test_cluster_rpc_stop_proxy_request(self, mock_uuid):
799 mock_uuid.uuid4.return_value = 'test-uuid'
800@@ -207,9 +205,11 @@
801 rq = rpc.stop_proxy_request(peers_only=True)
802 self.assertEqual({'trigger': 'test-uuid',
803 'broker-token': None,
804- 'builder-broker': None,
805- 'peers-only': True,
806+ 'broker-timestamp': None,
807+ 'builder-broker': '1.2.3.4',
808+ 'peers-only': 1,
809 'leader-changed-notification': None,
810+ 'resync-request': None,
811 'stop-proxy-service': 'test-uuid',
812 'stop-proxy-service-ack': None,
813 'sync-only-builders': None}, rq)
814@@ -217,13 +217,18 @@
815 rq = rpc.stop_proxy_request()
816 self.assertEqual({'trigger': 'test-uuid',
817 'broker-token': None,
818- 'builder-broker': None,
819+ 'broker-timestamp': None,
820+ 'builder-broker': '1.2.3.4',
821 'peers-only': None,
822 'leader-changed-notification': None,
823+ 'resync-request': None,
824 'stop-proxy-service': 'test-uuid',
825 'stop-proxy-service-ack': None,
826 'sync-only-builders': None}, rq)
827
828+ template_keys = set(rpc.template())
829+ self.assertTrue(set(rq.keys()).issubset(template_keys))
830+
831 @mock.patch('lib.swift_utils.uuid')
832 def test_cluster_rpc_stop_proxy_ack(self, mock_uuid):
833 mock_uuid.uuid4.return_value = 'token2'
834@@ -232,40 +237,58 @@
835 self.assertEqual({'trigger': 'token2',
836 'broker-token': None,
837 'builder-broker': None,
838+ 'broker-timestamp': None,
839 'peers-only': '1',
840 'leader-changed-notification': None,
841+ 'resync-request': None,
842 'stop-proxy-service': None,
843 'stop-proxy-service-ack': 'token1',
844 'sync-only-builders': None}, rq)
845
846+ template_keys = set(rpc.template())
847+ self.assertTrue(set(rq.keys()).issubset(template_keys))
848+
849+ @mock.patch.object(swift_utils, 'get_hostaddr', lambda *args: '1.2.3.4')
850+ @mock.patch.object(swift_utils, 'time')
851 @mock.patch('lib.swift_utils.uuid')
852- def test_cluster_rpc_sync_request(self, mock_uuid):
853+ def test_cluster_rpc_sync_request(self, mock_uuid, mock_time):
854+ mock_time.time = mock.Mock(return_value=float(1.234))
855 mock_uuid.uuid4.return_value = 'token2'
856 rpc = swift_utils.SwiftProxyClusterRPC()
857- rq = rpc.sync_rings_request('HostA', 'token1')
858+ rq = rpc.sync_rings_request('token1')
859 self.assertEqual({'trigger': 'token2',
860 'broker-token': 'token1',
861- 'builder-broker': 'HostA',
862+ 'broker-timestamp': '1.234000',
863+ 'builder-broker': '1.2.3.4',
864 'peers-only': None,
865 'leader-changed-notification': None,
866+ 'resync-request': None,
867 'stop-proxy-service': None,
868 'stop-proxy-service-ack': None,
869 'sync-only-builders': None}, rq)
870
871+ template_keys = set(rpc.template())
872+ self.assertTrue(set(rq.keys()).issubset(template_keys))
873+
874 @mock.patch('lib.swift_utils.uuid')
875 def test_cluster_rpc_notify_leader_changed(self, mock_uuid):
876- mock_uuid.uuid4.return_value = 'token1'
877+ mock_uuid.uuid4.return_value = 'e4b67426-6cc0-4aa3-829d-227999cd0a75'
878 rpc = swift_utils.SwiftProxyClusterRPC()
879- rq = rpc.notify_leader_changed()
880- self.assertEqual({'trigger': 'token1',
881+ rq = rpc.notify_leader_changed('token1')
882+ self.assertEqual({'trigger': 'e4b67426-6cc0-4aa3-829d-227999cd0a75',
883 'broker-token': None,
884 'builder-broker': None,
885+ 'broker-timestamp': None,
886 'peers-only': None,
887 'leader-changed-notification': 'token1',
888 'stop-proxy-service': None,
889 'stop-proxy-service-ack': None,
890+ 'resync-request': None,
891 'sync-only-builders': None}, rq)
892
893+ template_keys = set(rpc.template().keys())
894+ self.assertTrue(set(rq.keys()).issubset(template_keys))
895+
896 def test_all_responses_equal(self):
897 responses = [{'a': 1, 'c': 3}]
898 self.assertTrue(swift_utils.all_responses_equal(responses, 'b',
899@@ -301,6 +324,42 @@
900 rsps = []
901 self.assertIsNone(swift_utils.get_first_available_value(rsps, 'key3'))
902
903+ @mock.patch.object(swift_utils, 'relation_get')
904+ @mock.patch.object(swift_utils, 'related_units', lambda arg: ['proxy/1'])
905+ @mock.patch.object(swift_utils, 'relation_ids', lambda arg: ['cluster:1'])
906+ def test_is_most_recent_timestamp(self, mock_rel_get):
907+ mock_rel_get.return_value = {'broker-timestamp': '1111'}
908+ self.assertTrue(swift_utils.is_most_recent_timestamp('1234'))
909+ mock_rel_get.return_value = {'broker-timestamp': '2234'}
910+ self.assertFalse(swift_utils.is_most_recent_timestamp('1234'))
911+ mock_rel_get.return_value = {}
912+ self.assertFalse(swift_utils.is_most_recent_timestamp('1234'))
913+ mock_rel_get.return_value = {'broker-timestamp': '2234'}
914+ self.assertFalse(swift_utils.is_most_recent_timestamp(None))
915+
916+ @mock.patch.object(swift_utils, 'relation_get')
917+ @mock.patch.object(swift_utils, 'related_units', lambda arg: ['proxy/1'])
918+ @mock.patch.object(swift_utils, 'relation_ids', lambda arg: ['cluster:1'])
919+ def test_timestamps_available(self, mock_rel_get):
920+ mock_rel_get.return_value = {}
921+ self.assertFalse(swift_utils.timestamps_available('proxy/1'))
922+ mock_rel_get.return_value = {'broker-timestamp': '1234'}
923+ self.assertFalse(swift_utils.timestamps_available('proxy/1'))
924+ mock_rel_get.return_value = {'broker-timestamp': '1234'}
925+ self.assertTrue(swift_utils.timestamps_available('proxy/2'))
926+
927+ def _test_is_paused_unknown(self):
928+ fake_status_get = lambda: ("unknown", "")
929+ self.assertFalse(swift_utils.is_paused(status_get=fake_status_get))
930+
931+ def _test_is_paused_paused(self):
932+ fake_status_get = lambda: ("maintenance", "Paused")
933+ self.assertTrue(swift_utils.is_paused(status_get=fake_status_get))
934+
935+ def _test_is_paused_other_maintenance(self):
936+ fake_status_get = lambda: ("maintenance", "Hook")
937+ self.assertFalse(swift_utils.is_paused(status_get=fake_status_get))
938+
939 @mock.patch('lib.swift_utils.is_paused')
940 @mock.patch('lib.swift_utils.config')
941 @mock.patch('lib.swift_utils.set_os_workload_status')
