juju-ci-tools

Merge lp:~viswesn/juju-ci-tools/ensure_provider_cleanup into lp:juju-ci-tools

ensure_provider_cleanup
Merge into trunk

Proposed by viswesuwara nathan on 2017-01-24

Status:	Merged
Merged at revision:	1967
Proposed branch:	lp:~viswesn/juju-ci-tools/ensure_provider_cleanup
Merge into:	lp:juju-ci-tools
Prerequisite:	lp:~viswesn/juju-ci-tools/juju-ci-cleanup
Diff against target:	508 lines (+434/-3), 4 files modified: deploy_stack.py (+18/-1), substrate.py (+96/-2), tests/test_deploy_stack.py (+58/-0), tests/test_substrate.py (+262/-0)
To merge this branch:	bzr merge lp:~viswesn/juju-ci-tools/ensure_provider_cleanup

Reviewer	Date Requested	Status
Aaron Bentley (community)		Approve on 2017-03-27
Christopher Lee (community)	2017-01-24	Needs Fixing on 2017-03-27
Review via email: mp+315456@code.launchpad.net

Description of the change

Ensure cleanup for each provider, and report the list of resources that were not cleaned up during the cleanup activity.

Revision history for this message

Christopher Lee (veebers) wrote on 2017-01-27:

I don't think that adding unclean=[] to terminate_instances is the best way to do this. Instead you should make the call to terminate_instances and catch any exception that might occur and log that.
Changing terminate_instances like you have drastically changes how the method responds to errors (it used to raise an exception in some places, now it fails silently), which might have a major effect on other parts of existing code that rely on the original behaviour.

Also of note, I don't think having an empty list as a default arg like that is what you want; you might not be aware, but its behaviour might be surprising.
This link can describe it better than I can: http://docs.python-guide.org/en/latest/writing/gotchas/#mutable-default-arguments

Please also see inline comments too.
I plan to get some other eyes on this in case I miss something.

review: Needs Fixing
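
The mutable-default gotcha raised in the review above can be sketched as follows. This is a minimal illustration only; `collect_bad` and `collect_good` are hypothetical names and are not part of the branch.

```python
def collect_bad(item, unclean=[]):
    """Anti-pattern: the default list is created once, when the function
    is defined, and is then shared by every call that omits `unclean`."""
    unclean.append(item)
    return unclean


def collect_good(item, unclean=None):
    """Conventional fix: build a fresh list per call when none is given."""
    if unclean is None:
        unclean = []
    unclean.append(item)
    return unclean


# Copy the results so later calls cannot change what we recorded.
first_bad = list(collect_bad("i-1"))
second_bad = list(collect_bad("i-2"))   # "i-1" leaks in from the first call

first_good = collect_good("i-1")
second_good = collect_good("i-2")       # starts from an empty list again
```

With the shared default, the second call still sees the first call's entry; with the `None` sentinel, each call starts clean.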

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-01-30

1833. By viswesuwara nathan on 2017-01-30: review comments addressed

viswesuwara nathan (viswesn) wrote on 2017-01-31:

Most of the code remains the same for ensure_cleanup at this point, which invokes termination of the instances. The other cleanup activities specific to security groups, firewalls, etc. are not addressed here yet, because that requires contributions from the providers before they can be invoked in ensure_cleanup.

ABHAY (ankatare) wrote on 2017-02-01:

looks good to me, only one phrase correction required.

Aaron Bentley (abentley) wrote on 2017-02-02:

This branch introduces a bunch of duplicate code. Please avoid that.

I think you probably want an Account-specific list_dirty method and then a non-specific clean_dirty method which operates on that list. If a substrate doesn't support e.g. delete_detached_interfaces, then list_dirty should not return the dirty interfaces.

Whatever you do, don't repeat yourself. Factor out the duplicate code.

Don't catch Exception. Catch the specific exceptions that you expect to be raised.

When you do catch an exception, don't make it impossible to see what the exception was. We need to know *why* it's failing in order to improve the script in the future. You could print it. You could append a tuple of (resource, exception) to unclean_resources.

Are you certain that this script is not going to delete freshly-created resources? For example, could non_instance_groups contain a group created by a different script that is about to be associated to an instance? Please explain why it's safe to delete specific resource types in the docstring or comments.

You are providing a string to terminate_instances. I don't expect this to work; you should be providing a list of strings.

review: Needs Fixing
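
A minimal sketch of the two requests above: catch only the exception you expect, and keep the exception itself so the cause stays visible. `TerminateError` and `FakeAccount` are hypothetical stand-ins invented for this example; a real substrate would raise its own provider-specific error.

```python
class TerminateError(Exception):
    """Hypothetical stand-in for a provider-specific termination error."""


class FakeAccount:
    """Hypothetical account object that fails for one particular id."""

    def terminate_instances(self, instance_ids):
        if "i-bad" in instance_ids:
            raise TerminateError("cannot terminate i-bad")


def terminate_and_record(account, instance_ids):
    """Terminate instances one at a time, recording (resource, exception)."""
    unclean_resources = []
    for instance_id in instance_ids:
        try:
            # A list is passed even for a single id, not a bare string.
            account.terminate_instances([instance_id])
        except TerminateError as e:
            # Only the expected error is caught; anything else propagates.
            unclean_resources.append((instance_id, e))
    return unclean_resources


unclean = terminate_and_record(FakeAccount(), ["i-ok", "i-bad"])
```

The try/except lives in the caller here, so the callee's signature and its error behaviour are left unchanged.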

viswesuwara nathan (viswesn) wrote on 2017-02-03:

>> Whatever you do, don't repeat yourself. Factor out the duplicate code.
<<Viswesn>> Currently we see most of the code is duplicated but then once we have list of resources to be cleaned up in coming days for each substrate then ensure_cleanup is going to be different for each substrate.

>> Don't catch Exception. Catch the specific exceptions that you expect to be raised.
<<Viswesn>> Now I need to address this by sending the unclean=[] to the callee function and then append the resource name + exception something like ["instance_id", "instance-001", "Interface failed"]

------------------------------
delete_detached_interfaces(self, security_group, unclean=None)
    if unclean is None:
          unclean=[]
    ...
    ....
     except Exception as ex:
          unclean.append(["instance_id", instance_name, type(ex).__name__, ex.args])
------------------------------

Chris, let me know your comments on this. We decided not to do so in our last call but then to catch the exact exception then I need to pass unclean to the callee method rather than having like try..catch.. statement on invoking terminate_instance.

>> You are providing a string to terminate_instances. I don't expect this to work;
<<Viswesn>> Yes, I agree and I will address this issue.

>> Are you certain that this script is not going to delete freshly-created resources?
<<Viswesn>> if ensure_cleanup has visibility to those freshly-created resource then yes but seems like it is not at present and it is up the substrate developer to bring this in to picture as part of ensure_cleanup.

I will start working on this code after getting comments from Chris and Aaron.
Please let me know on this. Thanks.

Aaron Bentley (abentley) wrote on 2017-02-03:

> >> Whatever you do, don't repeat yourself. Factor out the duplicate code.
> <<Viswesn>> Currently we see most of the code is duplicated but then once we
> have list of resources to be cleaned up in coming days for each substrate then
> ensure_cleanup is going to be different for each substrate.

The parts that are the same, like terminate_instances, should not be duplicated. Factor them out.

> >> Don't catch Exception. Catch the specific exceptions that you expect to be
> raised.
> <<Viswesn>> Now I need to address this by sending the unclean=[] to the callee
> function and then append the resource name + exception something like
> ["instance_id", "instance-001", "Interface failed"]
>
> ------------------------------
> delete_detached_interfaces(self, security_group, unclean=None)
> if unclean is None:
> unclean=[]
> ...
> ....
> except Exception as ex:
> unclean.append(["instance_id", instance_name, type(ex).__name__,
> ex.args])
> ------------------------------
>
> Chris, let me know your comments on this. We decided not to do so in our last
> call but then to catch the exact exception then I need to pass unclean to the
> callee method rather than having like try..catch.. statement on invoking
> terminate_instance.

I don't understand why catching a specific exception would require you to pass unclean to the callee method.

> >> Are you certain that this script is not going to delete freshly-created
> resources?
> <<Viswesn>> if ensure_cleanup has visibility to those freshly-created resource
> then yes but seems like it is not at present and it is up the substrate
> developer to bring this in to picture as part of ensure_cleanup.

ensure_cleanup has visibility to those freshly-created resources. It is your problem, not the substrate developer's.

Aaron Bentley (abentley) wrote on 2017-02-03:

I have had a chance to look at the design doc: https://docs.google.com/document/d/1vi94tBGWf5UiWIIoer5Hn8Un9u7PqauX8fnF2iNIQpA/edit#

This feature has two intents:
1. Ensure resources are cleaned up so that we have uninterrupted testing and do not accrue extra costs
2. We want to report a failure when destroy-controller fails to clean up.

The way security groups and interfaces are being handled does not meet goal 2. It cannot report that destroy-controller failed to clean up for this test run, because it does not know that the security groups were created by this test run.

The doc even suggests "Match security groups/firewalls by controller uuid in the name or in tags/metadata".

Since the doc was written, goal 2 has become more important than goal 1, because our existing cleanup scripts are doing a good job and juju itself is better-behaved.

review: Needs Fixing
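
One way to read the design doc's suggestion of matching by controller uuid is sketched below. The group records, the name format, and the `controller-uuid` tag key are all assumptions made for illustration; they are not taken from the branch or from any provider API.

```python
def groups_for_controller(groups, controller_uuid):
    """Return the names of groups whose name or tags mention the uuid.

    `groups` is assumed to be a list of plain dicts with a "name" key and
    an optional "tags" dict (a hypothetical shape for this sketch).
    """
    matched = []
    for group in groups:
        in_name = controller_uuid in group["name"]
        in_tags = controller_uuid in group.get("tags", {}).values()
        if in_name or in_tags:
            matched.append(group["name"])
    return matched


groups = [
    {"name": "juju-aaaa-1111-0", "tags": {}},
    {"name": "web", "tags": {"controller-uuid": "aaaa-1111"}},
    {"name": "juju-bbbb-2222-0", "tags": {}},
]
ours = groups_for_controller(groups, "aaaa-1111")
```

Filtering this way means a leftover group can be attributed to this test run, which is what goal 2 needs, and groups created by other runs are left alone.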

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-02-28

1834. By viswesuwara nathan on 2017-02-27: Code updated
1835. By viswesuwara nathan on 2017-02-28: Review comments addressed

Christopher Lee (veebers) wrote on 2017-03-01:

A couple of times in discussion I've stated that we'll concentrate only on AWS for this MP and then do the other providers in follow up MPs.
This MP alters more than just the AWSAccount class, a couple of issues with this:

- This makes the MP even longer. The diff is 548 lines long, this is too long to comfortably review. As mentioned multiple times before we want to have short MPs (less than 300 lines) this makes reviews quicker, easier and less prone to errors.

- There are no tests included for the altered *Account classes. There is no assurance that this branch hasn't broken anything.

- Due to the first points, adding the needed tests would blow the diff count way out.

make lint fails.

Please notice the comments about methods that could be made functions, having functions makes testing a lot easier (no need to create or mock state).

review: Needs Fixing

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-02

1836. By viswesuwara nathan on 2017-03-02: Review comments addressed
1837. By viswesuwara nathan on 2017-03-02: Review comments addressed

viswesuwara nathan (viswesn) wrote on 2017-03-02:

This branch was created long ago for the juju-ci-cleanup enhancement work, and we had already integrated the changes specific to most of the cloud providers; that is the reason for seeing code changes for providers like OpenStack, GCE, etc. As discussed in the call today, I kept changes only for AWS and removed the other provider changes.

Please review it and let me know your comments on the updated code base. Thanks

Christopher Lee (veebers) wrote on 2017-03-03:

Some further queries about the data-structures used for storing errors.
I expanded on your question a bit, let me know what you think or if you would like more info.

Some test issues too. Make sure you're mocking the right thing, it's all too easy when mocking to produce false positives in your code that don't match reality when actually being run.

review: Needs Fixing
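
The false-positive risk mentioned above can be illustrated with a small sketch. It assumes Python 3's `unittest.mock` (the branch itself may use the standalone `mock` package), and `Client` is a hypothetical class standing in for the real lower-level client.

```python
from unittest.mock import MagicMock


class Client:
    """Hypothetical stand-in for the real lower-level client."""

    def terminate_instances(self, instance_ids):
        raise NotImplementedError


# A bare MagicMock accepts any attribute, so a misspelled method name
# (terminate_instance, missing the final "s") is silently accepted.
loose = MagicMock()
loose.terminate_instance(["i-1"])
typo_unnoticed = loose.terminate_instance.called

# With spec=Client, only attributes that exist on Client are allowed.
strict = MagicMock(spec=Client)
try:
    strict.terminate_instance(["i-1"])
    typo_caught = False
except AttributeError:
    typo_caught = True
```

Constraining the mock to the real interface is one way to keep a test from passing against calls the production code could never make.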

viswesuwara nathan (viswesn) wrote on 2017-03-03:

Chris, I updated the code by taking up almost all the review comments provided. Moreover, in the current code review I removed the test case "TestGetSecurityGroups" because I need to discuss with you how to handle "get_all_security_groups" using Mock.

Let us discuss on this on coming Monday call or please let me know by mail how to address "get_all_security_groups" mock in returning the value. It also requires invoking sg.instance() based on the object that it returns. Thanks

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-04

1838. By viswesuwara nathan on 2017-03-03: Review comments addressed
1839. By viswesuwara nathan on 2017-03-04: Review comments updated

Christopher Lee (veebers) wrote on 2017-03-14:

Hi Vishwa,

Let's talk in our next call re: your query for TestGetSecurityGroups.

Please see comments inline.

review: Needs Fixing

viswesuwara nathan (viswesn) wrote on 2017-03-14:

Hi Chris,
I addressed the review comments; let us discuss TestGetSecurityGroups further in the call. Thanks

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-15

1840. By viswesuwara nathan on 2017-03-14: Review comments addressed
1841. By viswesuwara nathan on 2017-03-15: GetSecurityGroups changes
1842. By viswesuwara nathan on 2017-03-15: GetSecurityGroups changes

Aaron Bentley (abentley) wrote on 2017-03-15:

attempt_terminate_instances does not work on AWS.

The tests mock out the direct callees, even though there are alternatives. One consequence is that you didn't notice that attempt_terminate_instances does not work on AWS.

The tests use assertEqual(0, mock.call_count) where they could use mock.assert_called_once_with.

Also, I don't think it makes sense to reimplement set.issubset as contains_only_known_instances.

There are some formatting suggestions, too. See below.

Also, please merge trunk more frequently. I got merge conflicts with one algorithm.

review: Needs Fixing
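
A short sketch of the difference between counting calls and asserting on them, assuming Python 3's `unittest.mock`. The `client` object here is a bare mock, not the real AWS client.

```python
from unittest.mock import MagicMock, call

client = MagicMock()
client.delete_security_group(name="sg-foo")

# Verifies both that there was exactly one call and what its arguments were.
client.delete_security_group.assert_called_once_with(name="sg-foo")

# A bare count says nothing about the arguments that were passed.
count = client.delete_security_group.call_count

# The argument check fails loudly when the arguments differ.
try:
    client.delete_security_group.assert_called_once_with(name="sg-bar")
    wrong_args_detected = False
except AssertionError:
    wrong_args_detected = True

# For a mock that should never have been called, assert_not_called()
# states the intent directly instead of comparing call_count with 0.
client.terminate_instances.assert_not_called()
```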

Aaron Bentley (abentley) wrote on 2017-03-15:

The failure:
http://pastebin.ubuntu.com/24183061/

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-16

1843. By viswesuwara nathan on 2017-03-16: Review comments addressed
1844. By viswesuwara nathan on 2017-03-16: Merge to trunk

viswesuwara nathan (viswesn) wrote on 2017-03-16:

1. Most of the review comments were addressed.
2. Code merged with trunk
3. Hitting one failure on having patch for terminate_instances - http://paste.ubuntu.com/24188081/
Please let me know how to resolve this error.
4. Update the comments section on addressing it.

Christopher Lee (veebers) wrote on 2017-03-17:

responded to comments.

Christopher Lee (veebers) wrote on 2017-03-17:

clarified some points and pointed out some test issues.

review: Needs Fixing

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-20

1845. By viswesuwara nathan on 2017-03-20: Review comments addressed

Christopher Lee (veebers) wrote on 2017-03-21:

You haven't answered the questions or given reason for the changes that differ to the requests from the previous review (namely repr vs str and the ability to store better details on why the delete failed). Please respond.

Also you seemed to have ignored the comments surrounding contains_only_known_instances (using issubset and removing the method). Please respond.

I've identified concerns with the testing below, namely patching the wrong thing and incorrect asserts.

review: Needs Fixing

viswesuwara nathan (viswesn) wrote on 2017-03-21:

I used repr instead of str because a couple of articles I read on Stack Overflow recommended repr over str.

Now coming to use of issubset; I already asked this on March 16th; Please find the same here

"Getting error as - AttributeError: 'list' object has no attribute 'issubset'
Do I miss something? Please guide me on this."

I also addressed the comments given and I too asked questions on few of them. Please
help me in answering those questions. Thanks
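
For reference, the practical difference between `str` and `repr` on an exception can be seen in a few lines. This is a generic Python illustration, not code from the branch; the exact `repr` text also varies slightly between Python versions (older versions include a trailing comma inside the parentheses), so it is not spelled out here.

```python
err = ValueError("Instance not found")

# str() gives only the message.
as_str = str(err)

# repr() also includes the exception type, which helps when the
# message alone does not say what kind of failure occurred.
as_repr = repr(err)

type_in_str = "ValueError" in as_str
type_in_repr = "ValueError" in as_repr
```

Keeping the type name is a concrete reason to prefer `repr` when the stored value is meant for later diagnosis.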

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-21

1846. By viswesuwara nathan on 2017-03-21: Review comments addressed

Christopher Lee (veebers) wrote on 2017-03-22:

Hi Vishwa,

In the future it would be helpful to mention why you made a change different to what was suggested, it removes any need for guessing and/or assuming that there was a misunderstanding.
Also, please state the actual reason for the decision (i.e. x is better because of y, not someone else said to do it this way).

Regarding your question re: issubset, please make sure you read all the comments as I have already answered this. On March 17th I state:
"""
You're missing that possibly_known_ids is a list object and doesn't have the method issubset. You need a set object for that.
The use of issubset was also mentioned in a previous comment.

Aaron's suggestion here is also to get rid of the function and just make the call where it's used. With reducing the call down to use issubset this makes sense.
Originally I suggested to have a separate function to ease testing and readability but this has evolved to the point where that's not needed.
"""

I have answered your question in line (sorry for the wall of text).
Please see here: http://paste.ubuntu.com/24226385/ to see an example of changing the test to follow the suggestions I made. Note this test now uncovers an error in the code.
Hint: it's related to the comment about what get_security_groups actually returns.

review: Needs Fixing
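
The `issubset` point above comes down to the fact that the method belongs to `set`, not `list`. A minimal sketch, with made-up instance ids:

```python
known_instance_ids = ["i-1", "i-2", "i-3"]
possibly_known_ids = ["i-1", "i-2"]

# Lists have no issubset method, which is the source of the
# AttributeError quoted earlier in the thread.
list_has_issubset = hasattr(possibly_known_ids, "issubset")

# Converting the candidate ids to a set is enough; the argument to
# issubset may be any iterable, so the known ids can stay a list.
all_known = set(possibly_known_ids).issubset(known_instance_ids)
some_unknown = set(["i-1", "i-9"]).issubset(known_instance_ids)
```

Because the whole check fits in one expression, it can be written inline at the call site rather than wrapped in a helper.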

Christopher Lee (veebers) wrote on 2017-03-22:

Updated comment.

viswesuwara nathan (viswesn) wrote on 2017-03-23:

Hi All,

>> In the future it would be helpful to mention why you made a change different to what was suggested, it removes any need for guessing and/or assuming that there was a misunderstanding.

<<Viswa>> I will definitely do this. I used repr instead of str after reading this link:
http://stackoverflow.com/questions/4308182/getting-the-exception-value-in-python

>> Regarding your question re: issubset
<<Viswa>> The function is now modified and is a single line; I didn't integrate it into the caller and kept the function contains_only_known_instances as it is.

Major changes were made in the test code to mock only the lower-level functions (client), as discussed; mocks were not introduced for the functions that we introduced.

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-24

1847. By viswesuwara nathan on 2017-03-24: Code review comments addressed

Christopher Lee (veebers) wrote on 2017-03-27:

One fix, one question that needs answered.

review: Needs Fixing

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-27

1848. By viswesuwara nathan on 2017-03-27: review comments addressed

viswesuwara nathan (viswesn) wrote on 2017-03-27:

Fixed and provided answer to your question. Thanks

Aaron Bentley (abentley) wrote on 2017-03-27:

Some minor issues mentioned inline. Otherwise, I think this is good to merge.

review: Approve

lp:~viswesn/juju-ci-tools/ensure_provider_cleanup updated on 2017-03-29

1849. By viswesuwara nathan on 2017-03-29: Review comments addressed
1850. By viswesuwara nathan on 2017-03-29: fixed lint errors

viswesuwara nathan (viswesn) wrote on 2017-03-29:

Fixed all the review comments.

Preview Diff

 === modified file 'deploy_stack.py'
 --- deploy_stack.py	2017-03-01 20:34:01 +0000
 +++ deploy_stack.py	2017-03-29 17:54:35 +0000
@@ -518,6 +518,22 @@
              os.environ['SSO_PASSWORD'], client, tear_down_client)
++def error_if_unclean(unclean_resources):
++    """List all the resource that were not cleaned up programmatically.
++
++    :param unclean_resources: List of unclean resources
++    """
++    if unclean_resources:
++        logging.critical("Following resource requires manual cleanup")
++        for resources in unclean_resources:
++            resource = resources.get("resource")
++            logging.critical(resource)
++            errors = resources.get("errors")
++            for (id, reason) in errors:
++                reason_string = "\t{}: {}".format(id, reason)
++                logging.critical(reason_string)
++
++
  class CreateController:
      """A Controller strategy where the controller is created.
@@ -920,7 +936,8 @@
                          if self.has_controller:
                              self.collect_resource_details()
                          self.tear_down(self.jes_enabled)
--                        self.ensure_cleanup()
++                        unclean_resources = self.ensure_cleanup()
++                        error_if_unclean(unclean_resources)
      # GZ 2016-08-11: Should this method be elsewhere to avoid poking backend?
      def _should_dump(self):
 === modified file 'substrate.py'
 --- substrate.py	2017-02-13 15:53:00 +0000
 +++ substrate.py	2017-03-29 17:54:35 +0000
@@ -76,6 +76,37 @@
      subprocess.check_call(command_args, env=environ)
++def attempt_terminate_instances(account, instance_ids):
++    """Initiate terminate instance method of specific handler
++
++    :param account: Substrate account object.
++    :param instance_ids: List of instance_ids to terminate
++    :return: List of instance_ids failed to terminate
++    """
++    uncleaned_instances = []
++    for instance_id in instance_ids:
++        try:
++            # We are calling terminate instances for each instances
++            # individually so as to catch any error.
++            account.terminate_instances([instance_id])
++        except Exception as e:
++            # Using too broad exception here because terminate_instances method
++            # is handlers specific
++            uncleaned_instances.append((instance_id, repr(e)))
++    return uncleaned_instances
++
++
++def contains_only_known_instances(known_instance_ids, possibly_known_ids):
++    """Identify instance_id_list only contains ids we know about.
++
++    :param known_instance_ids: The list of instance_ids (superset)
++    :param possibly_known_ids: The list of instance_ids (subset)
++    :return: True if known_instance_ids only contains
++    possibly_known_ids
++    """
++    return set(possibly_known_ids).issubset(set(known_instance_ids))
++
++
  class AWSAccount:
      """Represent the credentials of an AWS account."""
@@ -160,14 +191,77 @@
                      break
          return unclean
++    def cleanup_security_groups(self, instances, secgroups):
++        """Destroy any security groups used only by `instances`.
++
++        :param instances: The list of instance_ids
++        :param secgroups: dict of security groups
++        :return: list of failed deleted security groups
++        """
++        failures = []
++        for sg_id, sg_instances in secgroups:
++            if contains_only_known_instances(instances, sg_instances):
++                try:
++                    deleted = self.client.delete_security_group(name=sg_id)
++                    if not deleted:
++                        failures.append((sg_id, "Failed to delete"))
++                except EC2ResponseError as e:
++                    failures.append((sg_id, repr(e)))
++
++        return failures
++
++    def get_security_groups(self, instances):
++        """Get AWS configured security group
++        If instances list is specified then get security groups mapped
++        to those instances only.
++
++        :param instances: list of instance names
++        :return: list containing tuples; where each tuples contains security
++        group id as first element and the list of instances mapped to that
++        security group as second element. [(sg_id, [i_id, id2]),
++         (sg_id2, [i_id1])]
++        """
++        group_ids = [sg[0] for sg in self.iter_instance_security_groups(
++            instances)]
++        all_groups = self.client.get_all_security_groups(
++            group_ids=group_ids)
++        secgroups = [(sg.id, [id for id in sg.instances()])
++                     for sg in all_groups]
++        return secgroups
++
++    def terminate_instances(self, instance_ids):
++        """Terminate the specified instances."""
++        return self.client.terminate_instances(instance_ids=instance_ids)
++
      def ensure_cleanup(self, resource_details):
          """
          Do AWS specific clean-up activity.
          :param resource_details: The list of resource to be cleaned up
          :return: list of resources that were not cleaned up
          """
--        uncleaned_resource = []
--        return uncleaned_resource
++        uncleaned_resources = []
++
++        if not resource_details:
++            return uncleaned_resources
++
++        security_groups = self.get_security_groups(
++            resource_details.get('instances', []))
++
++        uncleaned_instances = attempt_terminate_instances(
++            self, resource_details.get('instances', []))
++
++        uncleaned_security_groups = self.cleanup_security_groups(
++            resource_details.get('instances', []), security_groups)
++
++        if uncleaned_instances:
++            uncleaned_resources.append(
++                {'resource': 'instances',
++                 'errors': uncleaned_instances})
++        if uncleaned_security_groups:
++            uncleaned_resources.append(
++                {'resource': 'security groups',
++                 'errors': uncleaned_security_groups})
++        return uncleaned_resources
  class OpenStackAccount:
 === modified file 'tests/test_deploy_stack.py'
 --- tests/test_deploy_stack.py	2017-03-10 00:50:40 +0000
 +++ tests/test_deploy_stack.py	2017-03-29 17:54:35 +0000
@@ -51,6 +51,7 @@
      retain_config,
      update_env,
      wait_for_state_server_to_shutdown,
++    error_if_unclean,
+     )
  from jujupy import (
      EnvJujuClient1X,
@@ -2673,3 +2674,60 @@
          wfp_mock.assert_called_once_with('example.org', 17070, closed=True,
                                           timeout=60)
          hni_mock.assert_called_once_with(client.env, 'i-255')
++
++
++class TestErrorIfUnclean(FakeHomeTestCase):
++    def test_empty_unclean_resources(self):
++        uncleaned_resources = []
++        error_if_unclean(uncleaned_resources)
++        self.assertEquals(self.log_stream.getvalue(), '')
++
++    def test_contain_unclean_resources(self):
++        uncleaned_resources = [
++                {
++                    'resource': 'instances',
++                    'errors': [('ifoo', 'err-msg'), ('ibar', 'err-msg')]
++                },
++                {
++                    'resource': 'security groups',
++                    'errors': [('sg-bar', 'err-msg')]
++                }
++            ]
++        error_if_unclean(uncleaned_resources)
++        self.assertListEqual(self.log_stream.getvalue().splitlines(), [
++            "CRITICAL Following resource requires manual cleanup",
++            "CRITICAL instances",
++            "CRITICAL \tifoo: err-msg",
++            "CRITICAL \tibar: err-msg",
++            "CRITICAL security groups",
++            "CRITICAL \tsg-bar: err-msg"
++        ])
++
++    def test_unclean_resources_without_sg_error(self):
++        uncleaned_resources = [
++                {
++                    'resource': 'instances',
++                    'errors': [('ifoo', 'err-msg'), ('ibar', 'err-msg')]
++                },
++        ]
++        error_if_unclean(uncleaned_resources)
++        self.assertListEqual(self.log_stream.getvalue().splitlines(), [
++            "CRITICAL Following resource requires manual cleanup",
++            "CRITICAL instances",
++            "CRITICAL \tifoo: err-msg",
++            "CRITICAL \tibar: err-msg",
++        ])
++
++    def test_unclean_resources_without_instances_error(self):
++        uncleaned_resources = [
++                {
++                    'resource': 'security groups',
++                    'errors': [('sg-bar', 'err-msg')]
++                }
++            ]
++        error_if_unclean(uncleaned_resources)
++        self.assertListEqual(self.log_stream.getvalue().splitlines(), [
++            "CRITICAL Following resource requires manual cleanup",
++            "CRITICAL security groups",
++            "CRITICAL \tsg-bar: err-msg"
++        ])
 === modified file 'tests/test_substrate.py'
 --- tests/test_substrate.py	2017-02-13 15:53:00 +0000
 +++ tests/test_substrate.py	2017-03-29 17:54:35 +0000
@@ -51,6 +51,8 @@
      stop_libvirt_domain,
      terminate_instances,
      verify_libvirt_domain,
++    contains_only_known_instances,
++    attempt_terminate_instances,
+     )
  from tests import (
      FakeHomeTestCase,
@@ -1677,3 +1679,263 @@
      def test_maas_ensure_cleanup(self):
          substrate_account = MAASAccount('profile', 'url', 'oauth')
          self.assertEqual([], substrate_account.ensure_cleanup([]))
++
++
++class FakeSecurityGroup:
++    def __init__(self, id, instances):
++        self.id = id
++        self._instances = instances
++
++    def instances(self):
++        return self._instances
++
++
++class TestAWSEnsureCleanUp(TestCase):
++    def test_ensure_cleanup_successfully(self):
++        client = MagicMock()
++        resource_details = dict()
++        resource_details['instances'] = ["i_id1", "i_id2"]
++        aws = AWSAccount(None, 'myregion', client)
++        client.get_all_security_groups.return_value = [
++            FakeSecurityGroup('sg_id1', ['i_id1', 'i_id2'])]
++        client.delete_security_group.return_value = True
++        uncleaned_resources = aws.ensure_cleanup(resource_details)
++        client.delete_security_group.assert_called_once_with(name='sg_id1')
++        self.assertEqual(client.get_all_instances.call_args_list,
++                         [call(instance_ids=['i_id1', 'i_id2'])])
++        self.assertEqual(uncleaned_resources, [])
++        self.assertEqual(
++            aws.client.terminate_instances.call_args_list,
++            [call(instance_ids=['i_id1']), call(instance_ids=['i_id2'])])
++
++    def test_ensure_cleanup_with_uncleaned_instances(self):
++        client = MagicMock()
++        resource_details = dict()
++        resource_details['instances'] = ["i_id1", "i_id2"]
++        aws = AWSAccount(None, 'myregion', client)
++        err_msg = 'Instance error'
++        client.terminate_instances.side_effect = [
++            Exception(err_msg), Exception(err_msg)]
++        client.get_all_security_groups.return_value = [
++            FakeSecurityGroup('sg_id1', ['i_id1', 'i_id2'])]
++        client.delete_security_group.return_value = True
++        uncleaned_resources = aws.ensure_cleanup(resource_details)
++        self.assertEqual(client.get_all_instances.call_args_list,
++                         [call(instance_ids=['i_id1', 'i_id2'])])
++        self.assertEqual(uncleaned_resources, [
++            {'errors': [('i_id1', "Exception('Instance error',)"),
++                        ('i_id2', "Exception('Instance error',)")],
++             'resource': 'instances'}])
++
++    def test_ensure_cleanup_with_uncleaned_sg(self):
++        client = MagicMock()
++        resource_details = dict()
++        resource_details['instances'] = ["i_id1", "i_id2"]
++        aws = AWSAccount(None, 'myregion', client)
++        client.terminate_instances.side_effect = ["i_id1", "i_id2"]
++        client.get_all_security_groups.return_value = [
++            FakeSecurityGroup('sg_id1', [])]
++        client.delete_security_group.return_value = False
++        uncleaned_resources = aws.ensure_cleanup(resource_details)
++        self.assertEqual(uncleaned_resources, [
++            {'errors': [('sg_id1', 'Failed to delete')],
++             'resource': 'security groups'}])
++        self.assertEqual(client.get_all_instances.call_args_list,
++                         [call(instance_ids=['i_id1', 'i_id2'])])
++
++    def test_ensure_cleanup_with_uncleaned_instances_and_sg(self):
++        client = MagicMock()
++        resource_details = dict()
++        resource_details['instances'] = ["i_id1", "i_id2"]
++        aws = AWSAccount(None, 'myregion', client)
++        ati_err_msg = 'Instance not found'
++        client.terminate_instances.side_effect = [
++             Exception(ati_err_msg), Exception(ati_err_msg)]
++        client.get_all_security_groups.return_value = [
++            FakeSecurityGroup('sg_id1', ['i_id1', 'i_id2'])]
++        client.delete_security_group.side_effect = EC2ResponseError(
++            400, "Bad Request",
++            body={
++                "RequestID": "xxx-yyy-zz",
++                "Error": {
++                    "Code": "Security group failed to delete",
++                    "Message": "failed"
++                }
++            })
++        uncleaned_resources = aws.ensure_cleanup(resource_details)
++        self.assertEqual(client.get_all_instances.call_args_list,
++                         [call(instance_ids=['i_id1', 'i_id2'])])
++        self.assertEqual(
++            uncleaned_resources,
++            [{'errors': [
++                ('i_id1', "Exception('Instance not found',)"),
++                ('i_id2', "Exception('Instance not found',)")],
++                'resource': 'instances'},
++                {'errors': [
++                    ('sg_id1',
++                     "EC2ResponseError: 400 Bad Request\n{"
++                     "'RequestID': 'xxx-yyy-zz', 'Error': {"
++                     "'Message': 'failed', "
++                     "'Code': 'Security group failed to delete'}}")],
++                    'resource': 'security groups'}])
++
++
++class TestAWSCleanUpSecurityGroups(TestCase):
++    def test_delete_secgroup_not_in_use(self):
++        secgroup = [("sg-foo", ["foo", "bar"])]
++        instances = ["foo", "bar"]
++        client = MagicMock()
++        aws = AWSAccount(None, 'myregion', client)
++        failures = aws.cleanup_security_groups(instances, secgroup)
++        self.assertEqual(failures, [])
++        self.assertEqual(
++            client.delete_security_group.call_args, call(name='sg-foo'))
++
++    def test_dont_delete_secgroup_in_use(self):
++        secgroup = [("sg-foo", ["foo", "bar", "baz"])]
++        instances = ["foo", "bar"]
++        client = MagicMock()
++        aws = AWSAccount(None, 'myregion', client)
++        failures = aws.cleanup_security_groups(instances, secgroup)
++        self.assertEqual(client.delete_security_group.call_count, 0)
++        self.assertEqual(failures, [])
++
++    def test_return_failure_on_exception(self):
++        secgroup = [("sg-foo", ["foo", "bar"]), ("sg-bar", ["foo", "bar"])]
++        instances = ["foo", "bar"]
++        client = MagicMock(spec=["delete_security_group"])
++        client.delete_security_group.side_effect = EC2ResponseError(
++            400, "Bad Request",
++            body={
++                "RequestID": "xxx-yyy-zz",
++                "Error": {
++                    "Code": "InvalidSecurityGroup.NotFound",
++                    "Message": "failed"
++                }
++            })
++        aws = AWSAccount(None, 'myregion', client)
++        failures = aws.cleanup_security_groups(instances, secgroup)
++        self.assertEqual(client.delete_security_group.call_args_list,
++                         [call(name='sg-foo'), call(name='sg-bar')])
++        self.assertListEqual(failures,
++                             [('sg-foo',
++                               "EC2ResponseError: 400 Bad Request\n{"
++                               "'RequestID': 'xxx-yyy-zz', 'Error': {"
++                               "'Message': 'failed',"
++                               " 'Code': 'InvalidSecurityGroup.NotFound'}}"),
++                              ('sg-bar', "EC2ResponseError: 400 Bad Request\n{"
++                               "'RequestID': 'xxx-yyy-zz', 'Error': {"
++                               "'Message': 'failed',"
++                               " 'Code': 'InvalidSecurityGroup.NotFound'}}")])
++
++    def test_return_mixed_response(self):
++        secgroup = [("sg-foo", ["foo", "bar"]), ("sg-bar", ["fooX", "barX"])]
++        instances = ["foo", "bar", "fooX", "barX"]
++        client = MagicMock(spec=["delete_security_group"])
++        client.delete_security_group.side_effect = [
++            True, False]
++        aws = AWSAccount(None, 'myregion', client)
++        failures = aws.cleanup_security_groups(instances, secgroup)
++        self.assertEqual(failures, [('sg-bar', 'Failed to delete')])
++        self.assertEqual(client.delete_security_group.call_args_list,
++                         [call(name='sg-foo'), call(name='sg-bar')])
++
++    def test_instance_mapped_to_more_than_one_secgroup(self):
++        # Delete a security group only if every instance mapped to it
++        # appears in the instances list.
++        secgroup = [("sg-foo", ["foo", "bar"]), ("sg-bar", ["foo", "baz"])]
++        instances = ["foo", "bar"]
++        client = MagicMock()
++        aws = AWSAccount(None, 'myregion', client)
++        failures = aws.cleanup_security_groups(instances, secgroup)
++        self.assertEqual(failures, [])
++        self.assertEqual(aws.client.delete_security_group.call_count, 1)
++        self.assertEqual(
++            client.delete_security_group.call_args, call(name='sg-foo'))
++
++
++class TestContainsOnlyKnownInstances(TestCase):
++    def test_return_true_when_all_ids_known(self):
++        instances = ["foo", "bar", "qnx"]
++        sg_list = ["foo", "bar", "qnx"]
++        self.assertEqual(
++            contains_only_known_instances(instances, sg_list), True)
++
++    def test_return_true_known_ids_are_subset(self):
++        instances = ["foo", "bar", "qnx", "foo1"]
++        sg_list = ["foo", "bar", "qnx"]
++        self.assertEqual(
++            contains_only_known_instances(instances, sg_list), True)
++
++    def test_return_false_when_some_ids_unknown(self):
++        instances = ["foo", "qnx"]
++        sg_list = ["foo", "bar"]
++        self.assertEqual(
++            contains_only_known_instances(instances, sg_list),
++            False)
++
++
++class TestAttemptTerminateInstances(TestCase):
++    def test_return_error_on_exception(self):
++        client = MagicMock()
++        instances = ["foo", "bar"]
++        err_msg = "Instance not found"
++        aws = AWSAccount(None, 'myregion', client)
++        client.terminate_instances.side_effect = Exception(err_msg)
++        failed = attempt_terminate_instances(aws, instances)
++        self.assertEqual(failed,
++                         [('foo', "Exception('{}',)".format(err_msg)),
++                          ('bar', "Exception('{}',)".format(err_msg))])
++
++    def test_return_with_no_error(self):
++        client = MagicMock()
++        instances = ["foo", "bar"]
++        aws = AWSAccount(None, 'myregion', client)
++        client.terminate_instances.return_value = ["foo", "bar"]
++        failed = attempt_terminate_instances(aws, instances)
++        self.assertEqual(client.terminate_instances.call_args_list,
++                         [call(instance_ids=['foo']),
++                          call(instance_ids=['bar'])])
++        self.assertEqual(failed, [])
++
++    def test_returns_has_some_error(self):
++        client = MagicMock()
++        instances = ["foo", "bar"]
++        err_msg = "Instance not found"
++        aws = AWSAccount(None, 'myregion', client)
++        client.terminate_instances.side_effect = ["foo", Exception(err_msg)]
++        failed = attempt_terminate_instances(aws, instances)
++        self.assertEqual(client.terminate_instances.call_args_list, [
++            call(instance_ids=['foo']), call(instance_ids=['bar'])])
++        self.assertEqual(failed, [
++            ('bar', "Exception('Instance not found',)")])
++
++
++class TestGetSecurityGroups(TestCase):
++    def test_instance_managed_by_single_security_group(self):
++        client = MagicMock()
++        instance_sec_groups = (('i_id1', 'sg_id1'), ('i_id2', 'sg_id1'))
++        all_sec_groups = [FakeSecurityGroup('sg_id1', ['i_id1', 'i_id2'])]
++        aws = AWSAccount(None, 'myregion', client)
++        client.get_all_security_groups.return_value = all_sec_groups
++        with patch.object(
++                aws, 'iter_instance_security_groups',
++                autospec=True, return_value=instance_sec_groups):
++            sec_groups = aws.get_security_groups(['i_id1', 'i_id2'])
++            self.assertEqual(sec_groups, [('sg_id1', ['i_id1', 'i_id2'])])
++
++    def test_instance_managed_by_multiple_security_group(self):
++        client = MagicMock()
++        instance_sec_groups = (('i_id1', 'sg_id1'), ('i_id2', 'sg_id1'))
++        all_sec_groups = [FakeSecurityGroup(
++            'sg_id1', ['i_id1', 'i_id2']),
++            FakeSecurityGroup('sg_id2', ['i_id1'])]
++        aws = AWSAccount(None, 'myregion', client)
++        client.get_all_security_groups.return_value = all_sec_groups
++        with patch.object(
++                aws, 'iter_instance_security_groups',
++                autospec=True, return_value=instance_sec_groups):
++            sec_groups = aws.get_security_groups(['i_id1', 'i_id2'])
++            self.assertEqual(sec_groups,
++                             [('sg_id1', ['i_id1', 'i_id2']),
++                              ('sg_id2', ['i_id1'])])
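
The tests above pin down the observable behaviour of two helpers whose bodies are not shown in this part of the diff. A minimal sketch consistent with those tests might look like the following. It is an illustration, not the code in substrate.py: the function bodies are assumptions, and FakeAccount is a hypothetical stand-in for AWSAccount so that the example runs without boto.

```python
def contains_only_known_instances(known_instance_ids, group_instance_ids):
    """Return True if every instance in the group is a known instance.

    Assumption: subset semantics, as implied by
    test_return_true_known_ids_are_subset.
    """
    return set(group_instance_ids).issubset(known_instance_ids)


def attempt_terminate_instances(account, instance_ids):
    """Terminate instances one at a time, collecting failures.

    Returns a list of (instance_id, error_text) tuples instead of raising,
    so one failed termination does not stop the others from being attempted.
    """
    failures = []
    for instance_id in instance_ids:
        try:
            account.terminate_instances([instance_id])
        except Exception as e:
            # repr() output differs between Python 2 and Python 3.7+
            # (trailing comma inside the parentheses), so callers should
            # not depend on the exact string.
            failures.append((instance_id, repr(e)))
    return failures


class FakeAccount(object):
    """Hypothetical stand-in for AWSAccount, used only in this sketch."""

    def __init__(self, fail_ids=()):
        self.fail_ids = set(fail_ids)
        self.calls = []

    def terminate_instances(self, instance_ids):
        # Record the call, then fail if any requested id is marked as bad.
        self.calls.append(list(instance_ids))
        if self.fail_ids.intersection(instance_ids):
            raise Exception('Instance not found')


if __name__ == '__main__':
    account = FakeAccount(fail_ids=['bar'])
    print(attempt_terminate_instances(account, ['foo', 'bar']))
    print(contains_only_known_instances(['foo', 'bar', 'baz'], ['foo', 'bar']))
```

Terminating one instance per call keeps each failure attributable to a single id, which is what lets ensure_cleanup report per-instance errors while leaving the original raise-on-error behaviour of terminate_instances untouched, in line with the review comment above.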
