Merge lp:~chad.smith/charms/precise/keystone/ha-support into lp:~openstack-charmers/charms/precise/keystone/ha-support

Proposed by Chad Smith on 2013-02-22
Status: Merged
Merged at revision: 55
Proposed branch: lp:~chad.smith/charms/precise/keystone/ha-support
Merge into: lp:~openstack-charmers/charms/precise/keystone/ha-support
Diff against target: 91 lines (+51/-0)
6 files modified
hooks/keystone-hooks (+5/-0)
hooks/lib/openstack_common.py (+16/-0)
scripts/add_to_cluster (+2/-0)
scripts/health_checks.d/service_ports_live (+13/-0)
scripts/health_checks.d/service_running (+13/-0)
scripts/remove_from_cluster (+2/-0)
To merge this branch: bzr merge lp:~chad.smith/charms/precise/keystone/ha-support
Reviewer Review Type Date Requested Status
Adam Gandelman 2013-02-22 Pending
Review via email: mp+150137@code.launchpad.net

This proposal supersedes a proposal from 2013-02-21.

Description of the change

This is a merge proposal to establish a potential template for health_script.d, add_to_cluster and remove_from_cluster delivery within the ha-supported openstack charms.

This is intended as a talking point and guidance for what landscape-client will expect on openstack charms supporting an HA services. If there is anything that doesn't look acceptable, I'm game for a change but wanted to pull a template together to seed discussions. Any suggestions about better charm-related ways.

These script locations will be hard-coded within landscape-client and will be run during rolling Openstack upgrades. So, prior to ODS, we'd like to make sure this process makes sense for most haclustered services.

The example of landscape-client's interaction with charm scripts is our HAServiceManager plugin at https://code.launchpad.net/~chad.smith/landscape-client/ha-manager-skeleton/+merge/148593

1. Before any packages are upgraded, landscape-client will only call /var/lib/juju/units/<unit_name>/charm/remove_from_cluster.
    - remove_from_cluster should will place the node in crm standby state, thereby migrating any active HA resources off of the current node.
2. landscape-client will upgrade any packages and any additional package dependencies will be installed.
3. Upon successful upgrade (maybe involving a reboot), landscape-client will run all run-parts health scripts from /var/lib/juju/units/<unit_name>/charm/health_scripts.d. These health scripts will need to return success 0 or fail (non-zero) exit codes to indicate any issues.
4. If all health scripts succeed, landscape-client will run /var/lib/juju/units/<unit_name>/charm/add_to_cluster to bring the node online in the HA cluster again.

  As James mentioned in email, remove_from_cluster script might take more action to cleanly remove an API service from openstack schedulers or configs. If a charm's remove_from_cluster does more than just migrate HA resources away from the node, then we'll need to discuss how to potentially decompose add_to_cluster functionality into separate scripts. One script that allows us to ensure a local openstack API is running locally and can be "health checked" and a separate script add_to_cluster which only finalizes and activates configuration in the HA service.

To post a comment you must log in.
Adam Gandelman (gandelman-a) wrote : Posted in a previous version of this proposal

Hey Chad-

You targeted this merge into the upstream charm @ lp:charms/keystone, can you retarget toward our WIP HA branch @ lp:~openstack-charmers/charms/precise/keystone/ha-support ?

The general approach here LGTM. But I'd love for things to be made a bit more generic so we can just drop this work into other charms unmodified:

- Perhaps save_script_rc() can to be moved to lib/openstack_common.py without the hard-coded KEYSTONE_SERVICE_NAME, and the caller can add that to the kwargs instead?

- I imagine there are some tests that would be common across all charms (service_running, service_ports_live) It would be great if those can be synced to other charms unmodified as well. Eg, perhaps have the ports look for *_OPENSTACK_PORT in the environment and ensure each one? The services check can determine what it should check based on *_OPENSTACK_SERVICE_NAME? Not sure if these common tests should live in some directory other than the tests specific to each service. /var/lib/juju/units/$unit/healthchecks/common.d/ & /var/lib/juju/units/$unit/healthchecks/$unit.d/ ??

End goal for me is to have the common framework for the health checking live with the other common code we maintain, and we can just sync it to other charms when we get it right in one. Of course this is complicated by the combination of bash + python charms we currently maintain, but still doable I think.

Chad Smith (chad.smith) wrote : Posted in a previous version of this proposal

Good points Adam. I'll make save_script_rc more generic and the health scripts will be a bit smarter to look for *OPENSTACK* vars where necessary.

I'm pulled onto something else at the moment, but I'll repost this review against the right source branch and get a change to you either today or tomorrow morning.

Adam Gandelman (gandelman-a) wrote :

Chad-
Thanks. Lookgs good, but could we move health_checks.d, add_to_cluster, remove_from-cluster and scriptrc out of the charm root directory and into a $CHAMR_DIR/scripts/ directory? Just to keep it clean. I'd do it myself post-merge but dont know how that would affect the tooling you're working on around this stuff.

57. By Chad Smith on 2013-02-22

move add_cluster, remove_from_cluster and health_checks.d to a new scripts subdir

58. By Chad Smith on 2013-03-05

more strict netstat port matching

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'hooks/keystone-hooks'
2--- hooks/keystone-hooks 2013-02-15 17:23:41 +0000
3+++ hooks/keystone-hooks 2013-03-05 22:14:21 +0000
4@@ -285,6 +285,11 @@
5 get_os_version_codename(available) > get_os_version_codename(installed)):
6 do_openstack_upgrade(config['openstack-origin'], packages)
7
8+ env_vars = {'OPENSTACK_SERVICE_KEYSTONE': 'keystone',
9+ 'OPENSTACK_PORT_ADMIN': config['admin-port'],
10+ 'OPENSTACK_PORT_PUBLIC': config['service-port']}
11+ save_script_rc(**env_vars)
12+
13 set_admin_token(config['admin-token'])
14
15 if eligible_leader():
16
17=== modified file 'hooks/lib/openstack_common.py'
18--- hooks/lib/openstack_common.py 2013-03-05 00:58:23 +0000
19+++ hooks/lib/openstack_common.py 2013-03-05 22:14:21 +0000
20@@ -245,3 +245,19 @@
21 f.write(template.render(context))
22 with open(HAPROXY_DEFAULT, 'w') as f:
23 f.write('ENABLED=1')
24+
25+def save_script_rc(script_path="scripts/scriptrc", **env_vars):
26+ """
27+ Write an rc file in the charm-delivered directory containing
28+ exported environment variables provided by env_vars. Any charm scripts run
29+ outside the juju hook environment can source this scriptrc to obtain
30+ updated config information necessary to perform health checks or
31+ service changes.
32+ """
33+ unit_name = os.getenv('JUJU_UNIT_NAME').replace('/', '-')
34+ juju_rc_path="/var/lib/juju/units/%s/charm/%s" % (unit_name, script_path)
35+ with open(juju_rc_path, 'wb') as rc_script:
36+ rc_script.write(
37+ "#!/bin/bash\n")
38+ [rc_script.write('export %s=%s\n' % (u, p))
39+ for u, p in env_vars.iteritems() if u != "script_path"]
40
41=== added directory 'scripts'
42=== added file 'scripts/add_to_cluster'
43--- scripts/add_to_cluster 1970-01-01 00:00:00 +0000
44+++ scripts/add_to_cluster 2013-03-05 22:14:21 +0000
45@@ -0,0 +1,2 @@
46+#!/bin/bash
47+crm node online
48
49=== added directory 'scripts/health_checks.d'
50=== added file 'scripts/health_checks.d/service_ports_live'
51--- scripts/health_checks.d/service_ports_live 1970-01-01 00:00:00 +0000
52+++ scripts/health_checks.d/service_ports_live 2013-03-05 22:14:21 +0000
53@@ -0,0 +1,13 @@
54+#!/bin/bash
55+# Validate that service ports are active
56+HEALTH_DIR=`dirname $0`
57+SCRIPTS_DIR=`dirname $HEALTH_DIR`
58+. $SCRIPTS_DIR/scriptrc
59+set -e
60+
61+# Grab any OPENSTACK_PORT* environment variables
62+openstack_ports=`env| awk -F '=' '(/OPENSTACK_PORT/){print $2}'`
63+for port in $openstack_ports
64+do
65+ netstat -ln | grep -q ":$port "
66+done
67
68=== added file 'scripts/health_checks.d/service_running'
69--- scripts/health_checks.d/service_running 1970-01-01 00:00:00 +0000
70+++ scripts/health_checks.d/service_running 2013-03-05 22:14:21 +0000
71@@ -0,0 +1,13 @@
72+#!/bin/bash
73+# Validate that service is running
74+HEALTH_DIR=`dirname $0`
75+SCRIPTS_DIR=`dirname $HEALTH_DIR`
76+. $SCRIPTS_DIR/scriptrc
77+set -e
78+
79+# Grab any OPENSTACK_SERVICE* environment variables
80+openstack_service_names=`env| awk -F '=' '(/OPENSTACK_SERVICE/){print $2}'`
81+for service_name in $openstack_service_names
82+do
83+ service $service_name status 2>/dev/null | grep -q running
84+done
85
86=== added file 'scripts/remove_from_cluster'
87--- scripts/remove_from_cluster 1970-01-01 00:00:00 +0000
88+++ scripts/remove_from_cluster 2013-03-05 22:14:21 +0000
89@@ -0,0 +1,2 @@
90+#!/bin/bash
91+crm node standby

Subscribers

People subscribed via source and target branches