Merge lp:~roadmr/charms/trusty/elasticsearch/es2-charm-retry-harder into lp:~onlineservices-charmers/charms/trusty/elasticsearch/elasticsearch2

Proposed by Daniel Manrique
Status: Needs review
Proposed branch: lp:~roadmr/charms/trusty/elasticsearch/es2-charm-retry-harder
Merge into: lp:~onlineservices-charmers/charms/trusty/elasticsearch/elasticsearch2
Diff against target: 16 lines (+5/-1)
1 file modified
tasks/peer-relations.yml (+5/-1)
To merge this branch: bzr merge lp:~roadmr/charms/trusty/elasticsearch/es2-charm-retry-harder
Reviewer: Canonical Onlineservices charmers
Review status: Pending
Review via email: mp+319497@code.launchpad.net

Commit message

Retry a few times while checking if unit joined the cluster.

Instead of just pausing for 30 seconds, which in practice is sometimes
not enough time for the new unit to join the cluster, poll the cluster
health every 15 seconds, up to 6 times (total wait time: 90 seconds),
and continue as soon as any retry sees a successful join. This should be
more responsive to early joins and more resilient to slow joins.


Unmerged revisions

46. By Daniel Manrique

Retry a few times while checking if unit joined the cluster.

Preview Diff

=== modified file 'tasks/peer-relations.yml'
--- tasks/peer-relations.yml 2015-10-27 22:30:30 +0000
+++ tasks/peer-relations.yml 2017-03-09 20:57:09 +0000
@@ -43,7 +43,11 @@
 - name: Pause to ensure that after restart unit has time to join.
   tags:
     - peer-relation-changed
-  pause: seconds=30
+  uri: url=http://localhost:9200/_cluster/health return_content=yes
+  register: cluster_health_wait
+  retries: 6
+  delay: 15
+  until: cluster_health_wait.json.number_of_nodes > 1
   when: cluster_health.json.number_of_nodes == 1
 
 - name: Record cluster health after restart
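
For reviewers less familiar with Ansible's `retries`/`delay`/`until` keywords, the polling behaviour described in the commit message can be sketched in plain Python. This is only an illustration, not part of the charm: `wait_for_join` and `fetch_local_health` are hypothetical names, and the sketch sleeps after every unsuccessful check so that the worst case matches the 90 seconds quoted in the commit message. Ansible's exact timing between attempts may differ.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_local_health(url: str = "http://localhost:9200/_cluster/health") -> dict:
    """Fetch the cluster health document (same URL as the task in the diff)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def wait_for_join(
    fetch_health: Callable[[], dict],
    retries: int = 6,
    delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll until the cluster reports more than one node.

    Returns the first health document with number_of_nodes > 1, or None
    if all `retries` checks still show a single node.
    """
    for _ in range(retries):
        health = fetch_health()
        if health.get("number_of_nodes", 0) > 1:
            return health  # unit has joined: stop waiting early
        sleep(delay)  # assumption: wait after every unsuccessful check
    return None
```

Calling `wait_for_join(fetch_local_health)` on a unit would return as soon as a second node appears, whereas the old `pause: seconds=30` always waited the full 30 seconds and gave up on the wait after that regardless of cluster state.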