Charm Helpers

Merge lp:~1chb1n/charm-helpers/amulet-rmq-helpers into lp:charm-helpers

Proposed by Ryan Beisner on 2015-08-12

Status:	Merged
Merged at revision:	445
Proposed branch:	lp:~1chb1n/charm-helpers/amulet-rmq-helpers
Merge into:	lp:charm-helpers
Diff against target:	827 lines (+613/-57), 3 files modified:
	charmhelpers/contrib/amulet/utils.py (+234/-52)
	charmhelpers/contrib/openstack/amulet/deployment.py (+20/-5)
	charmhelpers/contrib/openstack/amulet/utils.py (+359/-0)
To merge this branch:	bzr merge lp:~1chb1n/charm-helpers/amulet-rmq-helpers

Reviewer	Date Requested	Status
Liam Young (community)	2015-08-12	Approve on 2015-09-09
David Ames (community)	2015-09-01	Approve on 2015-09-01
Review via email: mp+267859@code.launchpad.net

Description of the change

Add amulet & openstack/amulet helpers for rabbitmq-server tests; resolve misc. race conditions in amulet helpers.

Add exception handling to the retry logic in service_restarted_since re: bug 1474030.
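
A minimal sketch of the retry-with-exception-handling idea, not the merged code. The names get_with_retry and fetch are hypothetical; fetch stands in for a lookup such as reading a process start time, and IOError is treated as "the process does not exist yet". Unlike the helper in the diff, this sketch counts every attempt, so a falsy return that does not raise cannot loop forever.

```python
import time


def get_with_retry(fetch, retry_count=2, retry_sleep_time=30,
                   sleep=time.sleep):
    """Call fetch() until it returns a truthy value or retries run out.

    fetch is a hypothetical zero-argument callable.  Returns the value
    from fetch(), or None if every attempt failed.
    """
    result = None
    tries = 0
    while tries <= retry_count and not result:
        try:
            result = fetch()
        except IOError:
            # Race avoidance: the target may not exist yet, so wait
            # before the next attempt instead of failing outright.
            sleep(retry_sleep_time)
        tries += 1
    return result
```

The sleep argument is injectable only so that the sketch can be exercised without real delays.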

Add deprecation WARN to the service_restarted amulet helper.

Update _determine_branch_locations to add a new force_series_current mechanism, i.e. always use trusty/nrpe instead of precise/nrpe, even when the series is precise.
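
A rough illustration of the series override, under stated assumptions. The function charm_location and its current_series parameter are hypothetical and exist only for this sketch; the 'lp:charms/{}/{}' template and the nrpe entry come from the diff. The sketch picks the series per service rather than reassigning a shared variable.

```python
def charm_location(name, series, current_series='trusty',
                   force_series_current=('nrpe',)):
    """Return a charm store branch location for a service.

    Charms listed in force_series_current always resolve to
    current_series, whatever series is under test.
    """
    if name in force_series_current:
        effective_series = current_series
    else:
        effective_series = series
    return 'lp:charms/{}/{}'.format(effective_series, name)
```

For example, charm_location('nrpe', 'precise') resolves to the trusty charm, while charm_location('mysql', 'precise') stays on precise.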

Note re: validate_sectionless_conf: plucked and generalized from gnuoy's openstack-dashboard amulet test, as rabbitmq confs have a similar need. I will update the openstack-dashboard test to use this helper once this lands.
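
A simplified sketch of the sectionless conf check, assuming plain key = value lines. The name find_conf_mismatches is hypothetical. The helper in the diff fails the test through amulet on a mismatch; this version returns the mismatches instead so that it runs without amulet.

```python
def find_conf_mismatches(file_contents, expected):
    """Compare key = value lines against a dict of expected values.

    Returns a dict of key -> (expected, actual) for each key whose
    value differs.  Lines without '=' are ignored.
    """
    mismatches = {}
    for line in file_contents.splitlines():
        # partition keeps any further '=' characters in the value
        key, sep, value = line.partition('=')
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key in expected and expected[key] != value:
            mismatches[key] = (expected[key], value)
    return mismatches
```

An empty dict means every expected key that appears in the file carries the expected value; keys missing from the file are not reported.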

This charmhelpers proposal is a prerequisite for the refactored rmq tests:
https://code.launchpad.net/~1chb1n/charms/trusty/rabbitmq-server/amulet-refactor-1508

lp:~1chb1n/charm-helpers/amulet-rmq-helpers updated on 2015-09-01

440. By Ryan Beisner on 2015-09-01: lint cleanup

David Ames (thedac) wrote on 2015-09-01:

Just a few comments in-line.

review: Needs Fixing

lp:~1chb1n/charm-helpers/amulet-rmq-helpers updated on 2015-09-01

441. By Ryan Beisner on 2015-09-01: clarify comment, fix typo, update add_rmq_test_user per review

Ryan Beisner (1chb1n) wrote on 2015-09-01:

Thank you for your review. Items addressed and pushed.

David Ames (thedac) wrote on 2015-09-01:

Looks good to me.

Good work.

review: Approve

lp:~1chb1n/charm-helpers/amulet-rmq-helpers updated on 2015-09-02

442. By Ryan Beisner on 2015-09-02: re-merge lp:~1chb1n/charm-helpers/amulet-svc-restart-race for updates

Liam Young (gnuoy) wrote on 2015-09-09:

Approve

review: Approve

Preview Diff

 === modified file 'charmhelpers/contrib/amulet/utils.py'
 --- charmhelpers/contrib/amulet/utils.py	2015-08-17 10:47:36 +0000
 +++ charmhelpers/contrib/amulet/utils.py	2015-09-02 01:36:03 +0000
@@ -19,9 +19,11 @@
  import logging
  import os
  import re
++import socket
  import subprocess
  import sys
  import time
++import uuid
  import amulet
  import distro_info
@@ -114,7 +116,7 @@
          # /!\ DEPRECATION WARNING (beisner):
          # New and existing tests should be rewritten to use
          # validate_services_by_name() as it is aware of init systems.
--        self.log.warn('/!\\ DEPRECATION WARNING:  use '
++        self.log.warn('DEPRECATION WARNING:  use '
                        'validate_services_by_name instead of validate_services '
                        'due to init system differences.')
@@ -269,33 +271,52 @@
          """Get last modification time of directory."""
          return sentry_unit.directory_stat(directory)['mtime']
--    def _get_proc_start_time(self, sentry_unit, service, pgrep_full=False):
--        """Get process' start time.
--
--           Determine start time of the process based on the last modification
--           time of the /proc/pid directory. If pgrep_full is True, the process
--           name is matched against the full command line.
--           """
--        if pgrep_full:
--            cmd = 'pgrep -o -f {}'.format(service)
--        else:
--            cmd = 'pgrep -o {}'.format(service)
--        cmd = cmd + '  | grep  -v pgrep || exit 0'
--        cmd_out = sentry_unit.run(cmd)
--        self.log.debug('CMDout: ' + str(cmd_out))
--        if cmd_out[0]:
--            self.log.debug('Pid for %s %s' % (service, str(cmd_out[0])))
--            proc_dir = '/proc/{}'.format(cmd_out[0].strip())
--            return self._get_dir_mtime(sentry_unit, proc_dir)
++    def _get_proc_start_time(self, sentry_unit, service, pgrep_full=None):
++        """Get start time of a process based on the last modification time
++           of the /proc/pid directory.
++
++        :sentry_unit:  The sentry unit to check for the service on
++        :service:  service name to look for in process table
++        :pgrep_full:  [Deprecated] Use full command line search mode with pgrep
++        :returns:  epoch time of service process start
++        """
++        if pgrep_full is not None:
++            # /!\ DEPRECATION WARNING (beisner):
++            # No longer implemented, as pidof is now used instead of pgrep.
++            # https://bugs.launchpad.net/charm-helpers/+bug/1474030
++            self.log.warn('DEPRECATION WARNING:  pgrep_full bool is no '
++                          'longer implemented re: lp 1474030.')
++
++        pid_list = self.get_process_id_list(sentry_unit, service)
++        pid = pid_list[0]
++        proc_dir = '/proc/{}'.format(pid)
++        self.log.debug('Pid for {} on {}: {}'.format(
++            service, sentry_unit.info['unit_name'], pid))
++
++        return self._get_dir_mtime(sentry_unit, proc_dir)
      def service_restarted(self, sentry_unit, service, filename,
--                          pgrep_full=False, sleep_time=20):
++                          pgrep_full=None, sleep_time=20):
          """Check if service was restarted.
             Compare a service's start time vs a file's last modification time
             (such as a config file for that service) to determine if the service
             has been restarted.
             """
++        # /!\ DEPRECATION WARNING (beisner):
++        # This method is prone to races in that no before-time is known.
++        # Use validate_service_config_changed instead.
++
++        # NOTE(beisner) pgrep_full is no longer implemented, as pidof is now
++        # used instead of pgrep.  pgrep_full is still passed through to ensure
++        # deprecation WARNS.  lp1474030
++        self.log.warn('DEPRECATION WARNING:  use '
++                      'validate_service_config_changed instead of '
++                      'service_restarted due to known races.')
++
          time.sleep(sleep_time)
          if (self._get_proc_start_time(sentry_unit, service, pgrep_full) >=
                  self._get_file_mtime(sentry_unit, filename)):
@@ -304,15 +325,15 @@
              return False
      def service_restarted_since(self, sentry_unit, mtime, service,
--                                pgrep_full=False, sleep_time=20,
--                                retry_count=2):
++                                pgrep_full=None, sleep_time=20,
++                                retry_count=2, retry_sleep_time=30):
          """Check if service was been started after a given time.
          Args:
            sentry_unit (sentry): The sentry unit to check for the service on
            mtime (float): The epoch time to check against
            service (string): service name to look for in process table
--          pgrep_full (boolean): Use full command line search mode with pgrep
++          pgrep_full: [Deprecated] Use full command line search mode with pgrep
            sleep_time (int): Seconds to sleep before looking for process
            retry_count (int): If service is not found, how many times to retry
@@ -321,30 +342,44 @@
                  False if service is older than mtime or if service was
                  not found.
          """
--        self.log.debug('Checking %s restarted since %s' % (service, mtime))
++        # NOTE(beisner) pgrep_full is no longer implemented, as pidof is now
++        # used instead of pgrep.  pgrep_full is still passed through to ensure
++        # deprecation WARNS.  lp1474030
++
++        unit_name = sentry_unit.info['unit_name']
++        self.log.debug('Checking that %s service restarted since %s on '
++                       '%s' % (service, mtime, unit_name))
          time.sleep(sleep_time)
--        proc_start_time = self._get_proc_start_time(sentry_unit, service,
--                                                    pgrep_full)
--        while retry_count > 0 and not proc_start_time:
--            self.log.debug('No pid file found for service %s, will retry %i '
--                           'more times' % (service, retry_count))
--            time.sleep(30)
--            proc_start_time = self._get_proc_start_time(sentry_unit, service,
--                                                        pgrep_full)
--            retry_count = retry_count - 1
++        proc_start_time = None
++        tries = 0
++        while tries <= retry_count and not proc_start_time:
++            try:
++                proc_start_time = self._get_proc_start_time(sentry_unit,
++                                                            service,
++                                                            pgrep_full)
++                self.log.debug('Attempt {} to get {} proc start time on {} '
++                               'OK'.format(tries, service, unit_name))
++            except IOError:
++                # NOTE(beisner) - race avoidance, proc may not exist yet.
++                # https://bugs.launchpad.net/charm-helpers/+bug/1474030
++                self.log.debug('Attempt {} to get {} proc start time on {} '
++                               'failed'.format(tries, service, unit_name))
++                time.sleep(retry_sleep_time)
++                tries += 1
          if not proc_start_time:
              self.log.warn('No proc start time found, assuming service did '
                            'not start')
              return False
          if proc_start_time >= mtime:
--            self.log.debug('proc start time is newer than provided mtime'
--                           '(%s >= %s)' % (proc_start_time, mtime))
++            self.log.debug('Proc start time is newer than provided mtime'
++                           '(%s >= %s) on %s (OK)' % (proc_start_time,
++                                                      mtime, unit_name))
              return True
          else:
--            self.log.warn('proc start time (%s) is older than provided mtime '
--                          '(%s), service did not restart' % (proc_start_time,
--                                                             mtime))
++            self.log.warn('Proc start time (%s) is older than provided mtime '
++                          '(%s) on %s, service did not '
++                          'restart' % (proc_start_time, mtime, unit_name))
              return False
      def config_updated_since(self, sentry_unit, filename, mtime,
@@ -374,8 +409,9 @@
              return False
      def validate_service_config_changed(self, sentry_unit, mtime, service,
--                                        filename, pgrep_full=False,
--                                        sleep_time=20, retry_count=2):
++                                        filename, pgrep_full=None,
++                                        sleep_time=20, retry_count=2,
++                                        retry_sleep_time=30):
          """Check service and file were updated after mtime
          Args:
@@ -383,9 +419,10 @@
            mtime (float): The epoch time to check against
            service (string): service name to look for in process table
            filename (string): The file to check mtime of
--          pgrep_full (boolean): Use full command line search mode with pgrep
--          sleep_time (int): Seconds to sleep before looking for process
++          pgrep_full: [Deprecated] Use full command line search mode with pgrep
++          sleep_time (int): Initial sleep in seconds to pass to test helpers
            retry_count (int): If service is not found, how many times to retry
++          retry_sleep_time (int): Time in seconds to wait between retries
          Typical Usage:
              u = OpenStackAmuletUtils(ERROR)
@@ -402,15 +439,25 @@
                  mtime, False if service is older than mtime or if service was
                  not found or if filename was modified before mtime.
          """
--        self.log.debug('Checking %s restarted since %s' % (service, mtime))
--        time.sleep(sleep_time)
--        service_restart = self.service_restarted_since(sentry_unit, mtime,
--                                                       service,
--                                                       pgrep_full=pgrep_full,
--                                                       sleep_time=0,
--                                                       retry_count=retry_count)
--        config_update = self.config_updated_since(sentry_unit, filename, mtime,
--                                                  sleep_time=0)
++
++        # NOTE(beisner) pgrep_full is no longer implemented, as pidof is now
++        # used instead of pgrep.  pgrep_full is still passed through to ensure
++        # deprecation WARNS.  lp1474030
++
++        service_restart = self.service_restarted_since(
++            sentry_unit, mtime,
++            service,
++            pgrep_full=pgrep_full,
++            sleep_time=sleep_time,
++            retry_count=retry_count,
++            retry_sleep_time=retry_sleep_time)
++
++        config_update = self.config_updated_since(
++            sentry_unit,
++            filename,
++            mtime,
++            sleep_time=0)
++
          return service_restart and config_update
      def get_sentry_time(self, sentry_unit):
@@ -428,7 +475,6 @@
          """Return a list of all Ubuntu releases in order of release."""
          _d = distro_info.UbuntuDistroInfo()
          _release_list = _d.all
--        self.log.debug('Ubuntu release list: {}'.format(_release_list))
          return _release_list
      def file_to_url(self, file_rel_path):
@@ -568,6 +614,142 @@
          return None
++    def validate_sectionless_conf(self, file_contents, expected):
++        """A crude conf parser.  Useful to inspect configuration files which
++        do not have section headers (as would be necessary in order to use
++        the configparser).  Such as openstack-dashboard or rabbitmq confs."""
++        for line in file_contents.split('\n'):
++            if '=' in line:
++                args = line.split('=')
++                if len(args) <= 1:
++                    continue
++                key = args[0].strip()
++                value = args[1].strip()
++                if key in expected.keys():
++                    if expected[key] != value:
++                        msg = ('Config mismatch.  Expected, actual:  {}, '
++                               '{}'.format(expected[key], value))
++                        amulet.raise_status(amulet.FAIL, msg=msg)
++
++    def get_unit_hostnames(self, units):
++        """Return a dict of juju unit names to hostnames."""
++        host_names = {}
++        for unit in units:
++            host_names[unit.info['unit_name']] = \
++                str(unit.file_contents('/etc/hostname').strip())
++        self.log.debug('Unit host names: {}'.format(host_names))
++        return host_names
++
++    def run_cmd_unit(self, sentry_unit, cmd):
++        """Run a command on a unit, return the output and exit code."""
++        output, code = sentry_unit.run(cmd)
++        if code == 0:
++            self.log.debug('{} `{}` command returned {} '
++                           '(OK)'.format(sentry_unit.info['unit_name'],
++                                         cmd, code))
++        else:
++            msg = ('{} `{}` command returned {} '
++                   '{}'.format(sentry_unit.info['unit_name'],
++                               cmd, code, output))
++            amulet.raise_status(amulet.FAIL, msg=msg)
++        return str(output), code
++
++    def file_exists_on_unit(self, sentry_unit, file_name):
++        """Check if a file exists on a unit."""
++        try:
++            sentry_unit.file_stat(file_name)
++            return True
++        except IOError:
++            return False
++        except Exception as e:
++            msg = 'Error checking file {}: {}'.format(file_name, e)
++            amulet.raise_status(amulet.FAIL, msg=msg)
++
++    def file_contents_safe(self, sentry_unit, file_name,
++                           max_wait=60, fatal=False):
++        """Get file contents from a sentry unit.  Wrap amulet file_contents
++        with retry logic to address races where a file checks as existing,
++        but no longer exists by the time file_contents is called.
++        Return None if file not found. Optionally raise if fatal is True."""
++        unit_name = sentry_unit.info['unit_name']
++        file_contents = False
++        tries = 0
++        while not file_contents and tries < (max_wait / 4):
++            try:
++                file_contents = sentry_unit.file_contents(file_name)
++            except IOError:
++                self.log.debug('Attempt {} to open file {} from {} '
++                               'failed'.format(tries, file_name,
++                                               unit_name))
++                time.sleep(4)
++                tries += 1
++
++        if file_contents:
++            return file_contents
++        elif not fatal:
++            return None
++        elif fatal:
++            msg = 'Failed to get file contents from unit.'
++            amulet.raise_status(amulet.FAIL, msg)
++
++    def port_knock_tcp(self, host="localhost", port=22, timeout=15):
++        """Open a TCP socket to check for a listening service on a host.
++
++        :param host: host name or IP address, default to localhost
++        :param port: TCP port number, default to 22
++        :param timeout: Connect timeout, default to 15 seconds
++        :returns: True if successful, False if connect failed
++        """
++
++        # Resolve host name if possible
++        try:
++            connect_host = socket.gethostbyname(host)
++            host_human = "{} ({})".format(connect_host, host)
++        except socket.error as e:
++            self.log.warn('Unable to resolve address: '
++                          '{} ({}) Trying anyway!'.format(host, e))
++            connect_host = host
++            host_human = connect_host
++
++        # Attempt socket connection
++        try:
++            knock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
++            knock.settimeout(timeout)
++            knock.connect((connect_host, port))
++            knock.close()
++            self.log.debug('Socket connect OK for host '
++                           '{} on port {}.'.format(host_human, port))
++            return True
++        except socket.error as e:
++            self.log.debug('Socket connect FAIL for'
++                           ' {} port {} ({})'.format(host_human, port, e))
++            return False
++
++    def port_knock_units(self, sentry_units, port=22,
++                         timeout=15, expect_success=True):
++        """Open a TCP socket to check for a listening service on each
++        listed juju unit.
++
++        :param sentry_units: list of sentry unit pointers
++        :param port: TCP port number, default to 22
++        :param timeout: Connect timeout, default to 15 seconds
++        :param expect_success: True by default, set False to invert logic
++        :returns: None if successful, Failure message otherwise
++        """
++        for unit in sentry_units:
++            host = unit.info['public-address']
++            connected = self.port_knock_tcp(host, port, timeout)
++            if not connected and expect_success:
++                return 'Socket connect failed.'
++            elif connected and not expect_success:
++                return 'Socket connected unexpectedly.'
++
++    def get_uuid_epoch_stamp(self):
++        """Returns a stamp string based on uuid4 and epoch time.  Useful in
++        generating test messages which need to be unique-ish."""
++        return '[{}-{}]'.format(uuid.uuid4(), time.time())
++
++# amulet juju action helpers:
      def run_action(self, unit_sentry, action,
                     _check_output=subprocess.check_output):
          """Run the named action on a given unit sentry.
 === modified file 'charmhelpers/contrib/openstack/amulet/deployment.py'
 --- charmhelpers/contrib/openstack/amulet/deployment.py	2015-08-10 19:56:36 +0000
 +++ charmhelpers/contrib/openstack/amulet/deployment.py	2015-09-02 01:36:03 +0000
@@ -44,8 +44,15 @@
             Determine if the local branch being tested is derived from its
             stable or next (dev) branch, and based on this, use the corresponding
             stable or next branches for the other_services."""
++
++        # Charms outside the lp:~openstack-charmers namespace
          base_charms = ['mysql', 'mongodb', 'nrpe']
++        # Force these charms to current series even when using an older series.
++        # ie. Use trusty/nrpe even when series is precise, as the P charm
++        # does not possess the necessary external master config and hooks.
++        force_series_current = ['nrpe']
++
          if self.series in ['precise', 'trusty']:
              base_series = self.series
          else:
@@ -53,11 +60,17 @@
          if self.stable:
              for svc in other_services:
++                if svc['name'] in force_series_current:
++                    base_series = self.current_next
++
                  temp = 'lp:charms/{}/{}'
                  svc['location'] = temp.format(base_series,
                                                svc['name'])
          else:
              for svc in other_services:
++                if svc['name'] in force_series_current:
++                    base_series = self.current_next
++
                  if svc['name'] in base_charms:
                      temp = 'lp:charms/{}/{}'
                      svc['location'] = temp.format(base_series,
@@ -77,21 +90,23 @@
          services = other_services
          services.append(this_service)
++
++        # Charms which should use the source config option
          use_source = ['mysql', 'mongodb', 'rabbitmq-server', 'ceph',
                        'ceph-osd', 'ceph-radosgw']
--        # Most OpenStack subordinate charms do not expose an origin option
--        # as that is controlled by the principle.
--        ignore = ['cinder-ceph', 'hacluster', 'neutron-openvswitch', 'nrpe']
++
++        # Charms which can not use openstack-origin, ie. many subordinates
++        no_origin = ['cinder-ceph', 'hacluster', 'neutron-openvswitch', 'nrpe']
          if self.openstack:
              for svc in services:
--                if svc['name'] not in use_source + ignore:
++                if svc['name'] not in use_source + no_origin:
                      config = {'openstack-origin': self.openstack}
                      self.d.configure(svc['name'], config)
          if self.source:
              for svc in services:
--                if svc['name'] in use_source and svc['name'] not in ignore:
++                if svc['name'] in use_source and svc['name'] not in no_origin:
                      config = {'source': self.source}
                      self.d.configure(svc['name'], config)
 === modified file 'charmhelpers/contrib/openstack/amulet/utils.py'
 --- charmhelpers/contrib/openstack/amulet/utils.py	2015-06-29 13:19:46 +0000
 +++ charmhelpers/contrib/openstack/amulet/utils.py	2015-09-02 01:36:03 +0000
@@ -27,6 +27,7 @@
  import heatclient.v1.client as heat_client
  import keystoneclient.v2_0 as keystone_client
  import novaclient.v1_1.client as nova_client
++import pika
  import swiftclient
  from charmhelpers.contrib.amulet.utils import (
@@ -602,3 +603,361 @@
              self.log.debug('Ceph {} samples (OK): '
                             '{}'.format(sample_type, samples))
              return None
++
++# rabbitmq/amqp specific helpers:
++    def add_rmq_test_user(self, sentry_units,
++                          username="testuser1", password="changeme"):
++        """Add a test user via the first rmq juju unit, check connection as
++        the new user against all sentry units.
++
++        :param sentry_units: list of sentry unit pointers
++        :param username: amqp user name, default to testuser1
++        :param password: amqp user password
++        :returns: None if successful.  Raise on error.
++        """
++        self.log.debug('Adding rmq user ({})...'.format(username))
++
++        # Check that user does not already exist
++        cmd_user_list = 'rabbitmqctl list_users'
++        output, _ = self.run_cmd_unit(sentry_units[0], cmd_user_list)
++        if username in output:
++            self.log.warning('User ({}) already exists, returning '
++                             'gracefully.'.format(username))
++            return
++
++        perms = '".*" ".*" ".*"'
++        cmds = ['rabbitmqctl add_user {} {}'.format(username, password),
++                'rabbitmqctl set_permissions {} {}'.format(username, perms)]
++
++        # Add user via first unit
++        for cmd in cmds:
++            output, _ = self.run_cmd_unit(sentry_units[0], cmd)
++
++        # Check connection against the other sentry_units
++        self.log.debug('Checking user connect against units...')
++        for sentry_unit in sentry_units:
++            connection = self.connect_amqp_by_unit(sentry_unit, ssl=False,
++                                                   username=username,
++                                                   password=password)
++            connection.close()
++
++    def delete_rmq_test_user(self, sentry_units, username="testuser1"):
++        """Delete a rabbitmq user via the first rmq juju unit.
++
++        :param sentry_units: list of sentry unit pointers
++        :param username: amqp user name, default to testuser1
++        :returns: None if successful or no such user.
++        """
++        self.log.debug('Deleting rmq user ({})...'.format(username))
++
++        # Check that the user exists
++        cmd_user_list = 'rabbitmqctl list_users'
++        output, _ = self.run_cmd_unit(sentry_units[0], cmd_user_list)
++
++        if username not in output:
++            self.log.warning('User ({}) does not exist, returning '
++                             'gracefully.'.format(username))
++            return
++
++        # Delete the user
++        cmd_user_del = 'rabbitmqctl delete_user {}'.format(username)
++        output, _ = self.run_cmd_unit(sentry_units[0], cmd_user_del)
++
++    def get_rmq_cluster_status(self, sentry_unit):
++        """Execute rabbitmq cluster status command on a unit and return
++        the full output.
++
++        :param sentry_unit: sentry unit
++        :returns: String containing console output of cluster status command
++        """
++        cmd = 'rabbitmqctl cluster_status'
++        output, _ = self.run_cmd_unit(sentry_unit, cmd)
++        self.log.debug('{} cluster_status:\n{}'.format(
++            sentry_unit.info['unit_name'], output))
++        return str(output)
++
++    def get_rmq_cluster_running_nodes(self, sentry_unit):
++        """Parse rabbitmqctl cluster_status output string, return list of
++        running rabbitmq cluster nodes.
++
++        :param sentry_unit: sentry unit
++        :returns: List containing node names of running nodes
++        """
++        # NOTE(beisner): rabbitmqctl cluster_status output is not
++        # json-parsable, do string chop foo, then json.loads that.
++        str_stat = self.get_rmq_cluster_status(sentry_unit)
++        if 'running_nodes' in str_stat:
++            pos_start = str_stat.find("{running_nodes,") + 15
++            pos_end = str_stat.find("]},", pos_start) + 1
++            str_run_nodes = str_stat[pos_start:pos_end].replace("'", '"')
++            run_nodes = json.loads(str_run_nodes)
++            return run_nodes
++        else:
++            return []
++
++    def validate_rmq_cluster_running_nodes(self, sentry_units):
++        """Check that all rmq unit hostnames are represented in the
++        cluster_status output of all units.
++
++        :param sentry_units: list of sentry unit pointers (all rmq units)
++        :returns: None if successful, otherwise return error message
++        """
++        host_names = self.get_unit_hostnames(sentry_units)
++        errors = []
++
++        # Query every unit for cluster_status running nodes
++        for query_unit in sentry_units:
++            query_unit_name = query_unit.info['unit_name']
++            running_nodes = self.get_rmq_cluster_running_nodes(query_unit)
++
++            # Confirm that every unit is represented in the queried unit's
++            # cluster_status running nodes output.
++            for validate_unit in sentry_units:
++                val_host_name = host_names[validate_unit.info['unit_name']]
++                val_node_name = 'rabbit@{}'.format(val_host_name)
++
++                if val_node_name not in running_nodes:
++                    errors.append('Cluster member check failed on {}: {} not '
++                                  'in {}\n'.format(query_unit_name,
++                                                   val_node_name,
++                                                   running_nodes))
++        if errors:
++            return ''.join(errors)
++
++    def rmq_ssl_is_enabled_on_unit(self, sentry_unit, port=None):
++        """Check a single juju rmq unit for ssl and port in the config file."""
++        host = sentry_unit.info['public-address']
++        unit_name = sentry_unit.info['unit_name']
++
++        conf_file = '/etc/rabbitmq/rabbitmq.config'
++        conf_contents = str(self.file_contents_safe(sentry_unit,
++                                                    conf_file, max_wait=16))
++        # Checks
++        conf_ssl = 'ssl' in conf_contents
++        conf_port = str(port) in conf_contents
++
++        # Port explicitly checked in config
++        if port and conf_port and conf_ssl:
++            self.log.debug('SSL is enabled  @{}:{} '
++                           '({})'.format(host, port, unit_name))
++            return True
++        elif port and not conf_port and conf_ssl:
++            self.log.debug('SSL is enabled @{} but not on port {} '
++                           '({})'.format(host, port, unit_name))
++            return False
++        # Port not checked (useful when checking that ssl is disabled)
++        elif not port and conf_ssl:
++            self.log.debug('SSL is enabled  @{}:{} '
++                           '({})'.format(host, port, unit_name))
++            return True
++        elif not port and not conf_ssl:
++            self.log.debug('SSL not enabled @{}:{} '
++                           '({})'.format(host, port, unit_name))
++            return False
++        else:
++            msg = ('Unknown condition when checking SSL status @{}:{} '
++                   '({})'.format(host, port, unit_name))
++            amulet.raise_status(amulet.FAIL, msg)
++
++    def validate_rmq_ssl_enabled_units(self, sentry_units, port=None):
++        """Check that ssl is enabled on rmq juju sentry units.
++
++        :param sentry_units: list of all rmq sentry units
++        :param port: optional ssl port override to validate
++        :returns: None if successful, otherwise return error message
++        """
++        for sentry_unit in sentry_units:
++            if not self.rmq_ssl_is_enabled_on_unit(sentry_unit, port=port):
++                return ('Unexpected condition:  ssl is disabled on unit '
++                        '({})'.format(sentry_unit.info['unit_name']))
++        return None
++
++    def validate_rmq_ssl_disabled_units(self, sentry_units):
++        """Check that ssl is disabled on listed rmq juju sentry units.
++
++        :param sentry_units: list of all rmq sentry units
++        :returns: None if successful, otherwise return error message
++        """
++        for sentry_unit in sentry_units:
++            if self.rmq_ssl_is_enabled_on_unit(sentry_unit):
++                return ('Unexpected condition:  ssl is enabled on unit '
++                        '({})'.format(sentry_unit.info['unit_name']))
++        return None
++
++    def configure_rmq_ssl_on(self, sentry_units, deployment,
++                             port=None, max_wait=60):
++        """Turn ssl charm config option on, with optional non-default
++        ssl port specification.  Confirm that it is enabled on every
++        unit.
++
++        :param sentry_units: list of sentry units
++        :param deployment: amulet deployment object pointer
++        :param port: amqp port, use defaults if None
++        :param max_wait: maximum time to wait in seconds to confirm
++        :returns: None if successful.  Raise on error.
++        """
++        self.log.debug('Setting ssl charm config option:  on')
++
++        # Enable RMQ SSL
++        config = {'ssl': 'on'}
++        if port:
++            config['ssl_port'] = port
++
++        deployment.configure('rabbitmq-server', config)
++
++        # Confirm
++        tries = 0
++        ret = self.validate_rmq_ssl_enabled_units(sentry_units, port=port)
++        while ret and tries < (max_wait / 4):
++            time.sleep(4)
++            self.log.debug('Attempt {}: {}'.format(tries, ret))
++            ret = self.validate_rmq_ssl_enabled_units(sentry_units, port=port)
++            tries += 1
++
++        if ret:
++            amulet.raise_status(amulet.FAIL, ret)
++
++    def configure_rmq_ssl_off(self, sentry_units, deployment, max_wait=60):
++        """Turn ssl charm config option off, confirm that it is disabled
++        on every unit.
++
++        :param sentry_units: list of sentry units
++        :param deployment: amulet deployment object pointer
++        :param max_wait: maximum time to wait in seconds to confirm
++        :returns: None if successful.  Raise on error.
++        """
++        self.log.debug('Setting ssl charm config option:  off')
++
++        # Disable RMQ SSL
++        config = {'ssl': 'off'}
++        deployment.configure('rabbitmq-server', config)
++
++        # Confirm
++        tries = 0
++        ret = self.validate_rmq_ssl_disabled_units(sentry_units)
++        while ret and tries < (max_wait / 4):
++            time.sleep(4)
++            self.log.debug('Attempt {}: {}'.format(tries, ret))
++            ret = self.validate_rmq_ssl_disabled_units(sentry_units)
++            tries += 1
++
++        if ret:
++            amulet.raise_status(amulet.FAIL, ret)
++
++    def connect_amqp_by_unit(self, sentry_unit, ssl=False,
++                             port=None, fatal=True,
++                             username="testuser1", password="changeme"):
++        """Establish and return a pika amqp connection to the rabbitmq service
++        running on a rmq juju unit.
++
++        :param sentry_unit: sentry unit pointer
++        :param ssl: boolean, default to False
++        :param port: amqp port, use defaults if None
++        :param fatal: boolean, default to True (raises on connect error)
++        :param username: amqp user name, default to testuser1
++        :param password: amqp user password
++        :returns: pika amqp connection pointer or None if failed and non-fatal
++        """
++        host = sentry_unit.info['public-address']
++        unit_name = sentry_unit.info['unit_name']
++
++        # Default port logic if port is not specified
++        if ssl and not port:
++            port = 5671
++        elif not ssl and not port:
++            port = 5672
++
++        self.log.debug('Connecting to amqp on {}:{} ({}) as '
++                       '{}...'.format(host, port, unit_name, username))
++
++        try:
++            credentials = pika.PlainCredentials(username, password)
++            parameters = pika.ConnectionParameters(host=host, port=port,
++                                                   credentials=credentials,
++                                                   ssl=ssl,
++                                                   connection_attempts=3,
++                                                   retry_delay=5,
++                                                   socket_timeout=1)
++            connection = pika.BlockingConnection(parameters)
++            assert connection.server_properties['product'] == 'RabbitMQ'
++            self.log.debug('Connect OK')
++            return connection
++        except Exception as e:
++            msg = ('amqp connection failed to {}:{} as '
++                   '{} ({})'.format(host, port, username, str(e)))
++            if fatal:
++                amulet.raise_status(amulet.FAIL, msg)
++            else:
++                self.log.warn(msg)
++                return None
++
++    def publish_amqp_message_by_unit(self, sentry_unit, message,
++                                     queue="test", ssl=False,
++                                     username="testuser1",
++                                     password="changeme",
++                                     port=None):
++        """Publish an amqp message to a rmq juju unit.
++
++        :param sentry_unit: sentry unit pointer
++        :param message: amqp message string
++        :param queue: message queue, default to test
++        :param username: amqp user name, default to testuser1
++        :param password: amqp user password
++        :param ssl: boolean, default to False
++        :param port: amqp port, use defaults if None
++        :returns: None.  Raises exception if publish failed.
++        """
++        self.log.debug('Publishing message to {} queue:\n{}'.format(queue,
++                                                                    message))
++        connection = self.connect_amqp_by_unit(sentry_unit, ssl=ssl,
++                                               port=port,
++                                               username=username,
++                                               password=password)
++
++        # NOTE(beisner): extra debug here re: pika hang potential:
++        #   https://github.com/pika/pika/issues/297
++        #   https://groups.google.com/forum/#!topic/rabbitmq-users/Ja0iyfF0Szw
++        self.log.debug('Defining channel...')
++        channel = connection.channel()
++        self.log.debug('Declaring queue...')
++        channel.queue_declare(queue=queue, auto_delete=False, durable=True)
++        self.log.debug('Publishing message...')
++        channel.basic_publish(exchange='', routing_key=queue, body=message)
++        self.log.debug('Closing channel...')
++        channel.close()
++        self.log.debug('Closing connection...')
++        connection.close()
++
++    def get_amqp_message_by_unit(self, sentry_unit, queue="test",
++                                 username="testuser1",
++                                 password="changeme",
++                                 ssl=False, port=None):
++        """Get an amqp message from a rmq juju unit.
++
++        :param sentry_unit: sentry unit pointer
++        :param queue: message queue, default to test
++        :param username: amqp user name, default to testuser1
++        :param password: amqp user password
++        :param ssl: boolean, default to False
++        :param port: amqp port, use defaults if None
++        :returns: amqp message body as string.  Raise if get fails.
++        """
++        connection = self.connect_amqp_by_unit(sentry_unit, ssl=ssl,
++                                               port=port,
++                                               username=username,
++                                               password=password)
++        channel = connection.channel()
++        method_frame, _, body = channel.basic_get(queue)
++
++        if method_frame:
++            self.log.debug('Retrieved message from {} queue:\n{}'.format(queue,
++                                                                         body))
++            channel.basic_ack(method_frame.delivery_tag)
++            channel.close()
++            connection.close()
++            return body
++        else:
++            msg = 'No message retrieved.'
++            amulet.raise_status(amulet.FAIL, msg)
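
Note on the confirm loops in configure_rmq_ssl_on and configure_rmq_ssl_off: both poll a validate_* helper every 4 seconds until it stops returning an error message or max_wait is used up. A minimal sketch of that pattern as a standalone function follows. The name wait_for_no_error and the interval and sleep parameters are hypothetical and are not part of this branch; the injectable sleep is only there so the loop can be exercised without real delays.

```python
import time


def wait_for_no_error(check, max_wait=60, interval=4, sleep=time.sleep):
    """Poll check() until it returns a falsy value or the budget runs out.

    check is any zero-argument callable that returns None on success or an
    error message string on failure, which is the convention used by the
    validate_* helpers in the diff.  Returns the last result of check().
    """
    tries = 0
    ret = check()
    while ret and tries < (max_wait / interval):
        sleep(interval)
        ret = check()
        tries += 1
    return ret


# Demo: a fake check that reports an error twice and then succeeds.
_results = iter(['ssl is disabled on unit (rmq/0)',
                 'ssl is disabled on unit (rmq/0)',
                 None])
slept = []
outcome = wait_for_no_error(lambda: next(_results), max_wait=60,
                            interval=4, sleep=slept.append)

# Demo: a check that never clears returns its last error message.
timed_out = wait_for_no_error(lambda: 'still failing', max_wait=8,
                              interval=4, sleep=lambda seconds: None)
```

With something like this, each configure_* helper would only need to call the function and pass a non-empty result to amulet.raise_status. That is a possible refactor, not what the branch does.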

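Note on the branch logic in rmq_ssl_is_enabled_on_unit: the four explicit conditions do not cover the case where a port is given but 'ssl' is absent from the config file, so that case falls through to the final else. The sketch below mirrors the same conditions as a pure function over the config text, which makes the fall-through easy to see. The function name classify_ssl_conf and the 'unknown' return value are hypothetical; in the diff the else branch calls amulet.raise_status(amulet.FAIL, msg) instead. The sample config strings are made up for illustration.

```python
def classify_ssl_conf(conf_contents, port=None):
    """Mirror the conditions of rmq_ssl_is_enabled_on_unit on plain text.

    Returns True or False for the four explicit branches, and the string
    'unknown' where the original would reach its final else branch.
    """
    # Same substring checks as the original helper.
    conf_ssl = 'ssl' in conf_contents
    conf_port = str(port) in conf_contents

    if port and conf_port and conf_ssl:
        return True
    elif port and not conf_port and conf_ssl:
        return False
    elif not port and conf_ssl:
        return True
    elif not port and not conf_ssl:
        return False
    else:
        # Reached when a port is given and 'ssl' is not in the config.
        return 'unknown'


# Hypothetical config fragments, not taken from a real unit.
ssl_conf = '{ssl_listeners, [5671]}'
plain_conf = '{tcp_listeners, [5672]}'

with_matching_port = classify_ssl_conf(ssl_conf, port=5671)
with_other_port = classify_ssl_conf(ssl_conf, port=5999)
no_port_no_ssl = classify_ssl_conf(plain_conf)
port_but_no_ssl = classify_ssl_conf(plain_conf, port=5671)
```

If this reading is right, configure_rmq_ssl_on called with an explicit port could hit the else branch on its first check, before the charm has written the ssl settings, rather than retrying until max_wait. That may be worth confirming against a live deployment.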