Merge lp:~chad.smith/charms/precise/keystone/ha-support into lp:~openstack-charmers/charms/precise/keystone/ha-support

Proposed by Chad Smith
Status: Merged
Merged at revision: 55
Proposed branch: lp:~chad.smith/charms/precise/keystone/ha-support
Merge into: lp:~openstack-charmers/charms/precise/keystone/ha-support
Diff against target: 91 lines (+51/-0)
6 files modified
hooks/keystone-hooks (+5/-0)
hooks/lib/openstack_common.py (+16/-0)
scripts/add_to_cluster (+2/-0)
scripts/health_checks.d/service_ports_live (+13/-0)
scripts/health_checks.d/service_running (+13/-0)
scripts/remove_from_cluster (+2/-0)
To merge this branch: bzr merge lp:~chad.smith/charms/precise/keystone/ha-support
Reviewer: Adam Gandelman (status: Pending)
Review via email: mp+150137@code.launchpad.net

This proposal supersedes a proposal from 2013-02-21.

Description of the change

This is a merge proposal to establish a potential template for health_checks.d, add_to_cluster and remove_from_cluster delivery within the HA-supported OpenStack charms.

This is intended as a talking point and guidance for what landscape-client will expect of OpenStack charms supporting HA services. If anything here doesn't look acceptable, I'm game for a change, but I wanted to pull a template together to seed discussion. Any suggestions about better charm-related ways of doing this are welcome.

These script locations will be hard-coded within landscape-client and the scripts will be run during rolling OpenStack upgrades. So, prior to ODS, we'd like to make sure this process makes sense for most HA-clustered services.

The example of landscape-client's interaction with charm scripts is our HAServiceManager plugin at https://code.launchpad.net/~chad.smith/landscape-client/ha-manager-skeleton/+merge/148593

1. Before any packages are upgraded, landscape-client will only call /var/lib/juju/units/<unit_name>/charm/remove_from_cluster.
    - remove_from_cluster will place the node in crm standby state, thereby migrating any active HA resources off of the current node.
2. landscape-client will upgrade packages, and any additional package dependencies will be installed.
3. Upon successful upgrade (possibly involving a reboot), landscape-client will run all health scripts from /var/lib/juju/units/<unit_name>/charm/health_checks.d via run-parts. These health scripts will need to return success (0) or failure (non-zero) exit codes to indicate any issues.
4. If all health scripts succeed, landscape-client will run /var/lib/juju/units/<unit_name>/charm/add_to_cluster to bring the node online in the HA cluster again.

  As James mentioned in email, the remove_from_cluster script might take more action to cleanly remove an API service from OpenStack schedulers or configs. If a charm's remove_from_cluster does more than just migrate HA resources away from the node, then we'll need to discuss how to decompose add_to_cluster functionality into separate scripts: one script that ensures a local OpenStack API is running and can be "health checked", and a separate add_to_cluster script which only finalizes and activates configuration in the HA service.
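A minimal sketch of the sequence above, assuming the scripts/ layout this branch ends up with, a placeholder unit name, and an apt-get call standing in for the package step landscape-client performs itself:

    #!/bin/bash
    # Sketch only: the unit name and the apt-get call are illustrative
    # assumptions; the actual package upgrade is driven by landscape-client.
    set -e
    UNIT=keystone-0                     # juju unit name with '/' replaced by '-'
    CHARM_DIR=/var/lib/juju/units/$UNIT/charm

    # 1. Put the node in crm standby so active HA resources migrate away.
    $CHARM_DIR/scripts/remove_from_cluster

    # 2. Upgrade packages (placeholder for landscape-client's own upgrade step).
    apt-get -y dist-upgrade

    # 3. Run every health check; run-parts stops at the first non-zero exit code.
    run-parts --exit-on-error $CHARM_DIR/scripts/health_checks.d

    # 4. All checks passed: bring the node back online in the HA cluster.
    $CHARM_DIR/scripts/add_to_cluster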

Revision history for this message
Adam Gandelman (gandelman-a) wrote : Posted in a previous version of this proposal

Hey Chad-

You targeted this merge at the upstream charm @ lp:charms/keystone; can you retarget it toward our WIP HA branch @ lp:~openstack-charmers/charms/precise/keystone/ha-support?

The general approach here LGTM. But I'd love for things to be made a bit more generic so we can just drop this work into other charms unmodified:

- Perhaps save_script_rc() can be moved to lib/openstack_common.py without the hard-coded KEYSTONE_SERVICE_NAME, and the caller can add that to the kwargs instead?

- I imagine there are some tests that would be common across all charms (service_running, service_ports_live). It would be great if those could be synced to other charms unmodified as well. E.g., perhaps have the ports check look for *_OPENSTACK_PORT in the environment and ensure each one? The services check can determine what it should check based on *_OPENSTACK_SERVICE_NAME? Not sure if these common tests should live in some directory other than the tests specific to each service. /var/lib/juju/units/$unit/healthchecks/common.d/ & /var/lib/juju/units/$unit/healthchecks/$unit.d/ ??

End goal for me is to have the common framework for the health checking live with the other common code we maintain, and we can just sync it to other charms when we get it right in one. Of course this is complicated by the combination of bash + python charms we currently maintain, but still doable I think.
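To make the env-var-driven approach suggested above concrete: with the keystone values wired up in the diff below, the generated scripts/scriptrc would look roughly like this (a sketch; the port numbers are assumed keystone defaults, not values taken from this proposal):

    #!/bin/bash
    # Example /var/lib/juju/units/keystone-0/charm/scripts/scriptrc as written
    # by save_script_rc(); 35357/5000 are assumed default ports.
    export OPENSTACK_SERVICE_KEYSTONE=keystone
    export OPENSTACK_PORT_ADMIN=35357
    export OPENSTACK_PORT_PUBLIC=5000

The health checks in the diff below simply source this file and loop over whatever OPENSTACK_PORT* / OPENSTACK_SERVICE* values they find, so the same scripts can be synced to other charms unmodified.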

Revision history for this message
Chad Smith (chad.smith) wrote : Posted in a previous version of this proposal

Good points, Adam. I'll make save_script_rc more generic, and the health scripts will be a bit smarter and look for *OPENSTACK* vars where necessary.

I'm pulled onto something else at the moment, but I'll repost this review against the right source branch and get a change to you either today or tomorrow morning.

Revision history for this message
Adam Gandelman (gandelman-a) wrote :

Chad-
Thanks. Looks good, but could we move health_checks.d, add_to_cluster, remove_from_cluster and scriptrc out of the charm root directory and into a $CHARM_DIR/scripts/ directory? Just to keep it clean. I'd do it myself post-merge but don't know how that would affect the tooling you're working on around this stuff.

57. By Chad Smith

move add_to_cluster, remove_from_cluster and health_checks.d to a new scripts subdir

58. By Chad Smith

more strict netstat port matching

Preview Diff

=== modified file 'hooks/keystone-hooks'
--- hooks/keystone-hooks 2013-02-15 17:23:41 +0000
+++ hooks/keystone-hooks 2013-03-05 22:14:21 +0000
@@ -285,6 +285,11 @@
         get_os_version_codename(available) > get_os_version_codename(installed)):
         do_openstack_upgrade(config['openstack-origin'], packages)

+    env_vars = {'OPENSTACK_SERVICE_KEYSTONE': 'keystone',
+                'OPENSTACK_PORT_ADMIN': config['admin-port'],
+                'OPENSTACK_PORT_PUBLIC': config['service-port']}
+    save_script_rc(**env_vars)
+
     set_admin_token(config['admin-token'])

     if eligible_leader():
=== modified file 'hooks/lib/openstack_common.py'
--- hooks/lib/openstack_common.py 2013-03-05 00:58:23 +0000
+++ hooks/lib/openstack_common.py 2013-03-05 22:14:21 +0000
@@ -245,3 +245,19 @@
         f.write(template.render(context))
     with open(HAPROXY_DEFAULT, 'w') as f:
         f.write('ENABLED=1')
+
+def save_script_rc(script_path="scripts/scriptrc", **env_vars):
+    """
+    Write an rc file in the charm-delivered directory containing
+    exported environment variables provided by env_vars. Any charm scripts run
+    outside the juju hook environment can source this scriptrc to obtain
+    updated config information necessary to perform health checks or
+    service changes.
+    """
+    unit_name = os.getenv('JUJU_UNIT_NAME').replace('/', '-')
+    juju_rc_path = "/var/lib/juju/units/%s/charm/%s" % (unit_name, script_path)
+    with open(juju_rc_path, 'wb') as rc_script:
+        rc_script.write(
+            "#!/bin/bash\n")
+        [rc_script.write('export %s=%s\n' % (u, p))
+            for u, p in env_vars.iteritems() if u != "script_path"]
=== added directory 'scripts'
=== added file 'scripts/add_to_cluster'
--- scripts/add_to_cluster 1970-01-01 00:00:00 +0000
+++ scripts/add_to_cluster 2013-03-05 22:14:21 +0000
@@ -0,0 +1,2 @@
+#!/bin/bash
+crm node online
=== added directory 'scripts/health_checks.d'
=== added file 'scripts/health_checks.d/service_ports_live'
--- scripts/health_checks.d/service_ports_live 1970-01-01 00:00:00 +0000
+++ scripts/health_checks.d/service_ports_live 2013-03-05 22:14:21 +0000
@@ -0,0 +1,13 @@
+#!/bin/bash
+# Validate that service ports are active
+HEALTH_DIR=`dirname $0`
+SCRIPTS_DIR=`dirname $HEALTH_DIR`
+. $SCRIPTS_DIR/scriptrc
+set -e
+
+# Grab any OPENSTACK_PORT* environment variables
+openstack_ports=`env| awk -F '=' '(/OPENSTACK_PORT/){print $2}'`
+for port in $openstack_ports
+do
+  netstat -ln | grep -q ":$port "
+done
=== added file 'scripts/health_checks.d/service_running'
--- scripts/health_checks.d/service_running 1970-01-01 00:00:00 +0000
+++ scripts/health_checks.d/service_running 2013-03-05 22:14:21 +0000
@@ -0,0 +1,13 @@
+#!/bin/bash
+# Validate that service is running
+HEALTH_DIR=`dirname $0`
+SCRIPTS_DIR=`dirname $HEALTH_DIR`
+. $SCRIPTS_DIR/scriptrc
+set -e
+
+# Grab any OPENSTACK_SERVICE* environment variables
+openstack_service_names=`env| awk -F '=' '(/OPENSTACK_SERVICE/){print $2}'`
+for service_name in $openstack_service_names
+do
+  service $service_name status 2>/dev/null | grep -q running
+done
=== added file 'scripts/remove_from_cluster'
--- scripts/remove_from_cluster 1970-01-01 00:00:00 +0000
+++ scripts/remove_from_cluster 2013-03-05 22:14:21 +0000
@@ -0,0 +1,2 @@
+#!/bin/bash
+crm node standby
