MAAS

Merge lp:~gz/maas/detailed_poweractionfail_1155175 into lp:~maas-committers/maas/trunk

detailed_poweractionfail_1155175
Merge into trunk

Proposed by Martin Packman on 2013-03-14

Status:

Merged

Approved by:

Martin Packman on 2013-03-27

Approved revision:

no longer in the source branch.

Merged at revision:

1456

Proposed branch:

lp:~gz/maas/detailed_poweractionfail_1155175

Merge into:

lp:~maas-committers/maas/trunk

Diff against target:

145 lines (+40/-32)

2 files modified

src/provisioningserver/power/poweraction.py (+18/-17)
src/provisioningserver/power/tests/test_poweraction.py (+22/-15)

To merge this branch:

bzr merge lp:~gz/maas/detailed_poweractionfail_1155175

High

Fix Released

Link a bug report

Reviewer	Review Type	Date Requested	Status
Julian Edwards (community)		2013-03-14	Approve on 2013-03-16
Review via email: mp+153425@code.launchpad.net

Commit message

Give more detailed error message on failures from power management scripts

Description of the change

Give more detailed error message on failures from power management scripts.

This changes how the scripts are called a little to take advantage of nicer wrappers included in Python 2.7, and customises the PowerActionFail exception type to stringify with more details.

How exactly we want the output shown is an open question, currently it's just on (at least one) new line in the error string, perhaps indenting or otherwise wrapping it would be preferable.

Docstring update was needed, part of it (taking **kwargs) is just wrong, and the return is entirely ignored in real code so should probably not be promised.

Some testing specifics were fiddled with, just to make a couple of changes slightly neater.
--

Revision history for this message

Julian Edwards (julian-edwards) wrote on 2013-03-15:

Thanks for the branch Martin, a definite improvement. I am a little concerned about one of the changes in [2] below so marking needs-fixing for that.

Cheers.

[1]

53 + output = subprocess.check_output(
54 + commands, shell=True, stderr=subprocess.STDOUT, close_fds=True)
55 + except Exception as e:

Catching a bare Exception is bad form, can you be more specific? The original caught OSError.

[2]

56 + raise PowerActionFail(self, e)
57 + # This output is only examined in tests, execute just ignores it
58 + return output

You've changed this from a PIPE to using the parent's FDs. I wrote the original code and although I can't remember why I used a PIPE I am pretty sure I had a good reason, so I reckon you should change it back to PIPE. I have a sneaking suspicion that some of the scripts caused a hang or something weird in the output from the celery process.
I should have added a comment about it :/

[3]

Our coding standards say to add a line break immediately after an opening parenthesis, so can you fix the following bits please:

86 + self.assertThat(lambda: pa.render_template(template),
87 + Raises(MatchesException(PowerActionFail,
88 + ".*name 'mac' is not defined at line \d+ column \d+ "
89 + "in file %s" % re.escape(template_name))))

snip....

95 def test_execute_raises_PowerActionFail_when_script_fails(self):
96 path = self._create_template_file("this_is_not_valid_shell")
97 - exception = self.assertRaises(PowerActionFail, self.run_action, path)
98 - self.assertEqual(
99 - "ether_wake failed with return code 127", exception.message)
100 + self.assertThat(lambda: self.run_action(path),
101 + Raises(MatchesException(PowerActionFail,
102 + "ether_wake failed.* return.* 127")))
103 +
104 + def test_execute_raises_PowerActionFail_with_output(self):
105 + path = self._create_template_file("echo reason for failure; exit 1")
106 + self.assertThat(lambda: self.run_action(path),
107 + Raises(MatchesException(PowerActionFail, ".*:\nreason for failure")))

Thanks for the branch Martin, a definite improvement.  I am a little concerned about one of the changes in [2] below so marking needs-fixing for that.

Cheers.

[1]

53	+            output = subprocess.check_output(
54	+                commands, shell=True, stderr=subprocess.STDOUT, close_fds=True)
55	+        except Exception as e:

Catching a bare Exception is bad form, can you be more specific?  The original caught OSError.

[2]

56	+            raise PowerActionFail(self, e)
57	+        # This output is only examined in tests, execute just ignores it
58	+        return output

You've changed this from a PIPE to using the parent's FDs.  I wrote the original code and although I can't remember why I used a PIPE I am pretty sure I had a good reason, so I reckon you should change it back to PIPE.  I have a sneaking suspicion that some of the scripts caused a hang or something weird in the output from the celery process.
I should have added a comment about it :/

[3]

Our coding standards say to add a line break immediately after an opening parenthesis, so can you fix the following bits please:

86	+        self.assertThat(lambda: pa.render_template(template),
87	+            Raises(MatchesException(PowerActionFail,
88	+                ".*name 'mac' is not defined at line \d+ column \d+ "
89	+                "in file %s" % re.escape(template_name))))

snip....

95	     def test_execute_raises_PowerActionFail_when_script_fails(self):
96	         path = self._create_template_file("this_is_not_valid_shell")
97	-        exception = self.assertRaises(PowerActionFail, self.run_action, path)
98	-        self.assertEqual(
99	-            "ether_wake failed with return code 127", exception.message)
100	+        self.assertThat(lambda: self.run_action(path),
101	+            Raises(MatchesException(PowerActionFail,
102	+                "ether_wake failed.* return.* 127")))
103	+
104	+    def test_execute_raises_PowerActionFail_with_output(self):
105	+        path = self._create_template_file("echo reason for failure; exit 1")
106	+        self.assertThat(lambda: self.run_action(path),
107	+            Raises(MatchesException(PowerActionFail, ".*:\nreason for failure")))

review: Needs Fixing

Revision history for this message

Martin Packman (gz) wrote on 2013-03-15:

> Catching a bare Exception is bad form, can you be more specific? The original
> caught OSError.

It's not as bad as it looks, because it's effectively a reraise, and excludes BaseException. That said, it should just be wrapping CalledProcessError really. The OSError clause in the old code was pretty bogus, it was untested, and would only be hit if /bin/sh doesn't exist... which is a broken enough situation that leaving the error unwrapped is probably preferable. I'll make the except more specific.

> You've changed this from a PIPE to using the parent's FDs. I wrote the
> original code and although I can't remember why I used a PIPE I am pretty sure
> I had a good reason, so I reckon you should change it back to PIPE. I have a
> sneaking suspicion that some of the scripts caused a hang or something weird
> in the output from the celery process.

So, the old behaviour was:
* Redirect stdout and stderr to separate pipes
* Select on stdout and stderr and fill up a buffer for each till the process exits
* Discard stdout and stderr

The new behaviour is:
* Redirect stdout and stderr to the same pipe
* Read the combined stdout/stderr output and wait till the process exits
* If the exitcode is non-zero, include stdout/stderr output in the exception
* Discard stdout/stderr output

I could write a test that demonstrates the code doesn't block with large outputs, but the subprocess module documentation is pretty clear (even if the interfaces themselves are not).

> Our coding standards say to add a line break immediately after an opening
> parenthesis, so can you fix the following bits please:

Done.

Revision history for this message

Julian Edwards (julian-edwards) wrote on 2013-03-16:

On Friday 15 Mar 2013 11:48:23 you wrote:
> The new behaviour is:
> * Redirect stdout and stderr to the same pipe

It's not a new pipe though, it's the parent's pipe.

> * Read the combined stdout/stderr output and wait till the process exits
> * If the exitcode is non-zero, include stdout/stderr output in the exception
> * Discard stdout/stderr output
>
> I could write a test that demonstrates the code doesn't block with large
> outputs, but the subprocess module documentation is pretty clear (even if
> the interfaces themselves are not).

Ok I don't want to block you on this - please test the power scripts in the
lab to make sure they pass muster and then go ahead and land it.

Thanks for the fix!

review: Approve

Revision history for this message

Martin Packman (gz) wrote on 2013-03-16:

> On Friday 15 Mar 2013 11:48:23 you wrote:
> > The new behaviour is:
> > * Redirect stdout and stderr to the same pipe
>
> It's not a new pipe though, it's the parent's pipe.

Okay, I think the confusion here is that subprocess.check_output implies stdout=subprocess.PIPE without it being explicitly specified. If stdout was inherited from the parent, how would it be a return value from the call?

Revision history for this message

Julian Edwards (julian-edwards) wrote on 2013-03-17:

On Saturday 16 Mar 2013 14:10:25 you wrote:
> > On Friday 15 Mar 2013 11:48:23 you wrote:
> > > The new behaviour is:
> > > * Redirect stdout and stderr to the same pipe
> >
> > It's not a new pipe though, it's the parent's pipe.
>
> Okay, I think the confusion here is that subprocess.check_output implies
> stdout=subprocess.PIPE without it being explicitly specified. If stdout was
> inherited from the parent, how would it be a return value from the call?

I don't think that assumption is valid. AFAICT it re-uses the parent's
stdout/stderr which is why I raised this in the first place :)

From the docs:

stdin, stdout and stderr specify the executed program’s standard input,
standard output and standard error file handles, respectively. Valid values
are PIPE, an existing file descriptor (a positive integer), an existing file
object, and None. PIPE indicates that a new pipe to the child should be
created. With the default settings of None, no redirection will occur; the
child’s file handles will be inherited from the parent

Revision history for this message

Martin Packman (gz) wrote on 2013-03-18:

Open /usr/lib/python2.7/subprocess.py on your machine and look at the implementation of check_output, it's a pretty simple helper:

    def check_output(*popenargs, **kwargs):
        ...
        if 'stdout' in kwargs:
            raise ValueError('stdout argument not allowed, it will be overridden.')
        process = Popen(stdout=PIPE, *popenargs, **kwargs)
        ...

Revision history for this message

Julian Edwards (julian-edwards) wrote on 2013-03-18:

On Monday 18 Mar 2013 11:00:47 you wrote:
> Open /usr/lib/python2.7/subprocess.py on your machine and look at the
> implementation of check_output, it's a pretty simple helper:
>
> def check_output(*popenargs, **kwargs):
> ...
> if 'stdout' in kwargs:
> raise ValueError('stdout argument not allowed, it will be
> overridden.') process = Popen(stdout=PIPE, *popenargs, **kwargs)
> ...

Soooo, the documentation is telling porkies. Awesome! :(

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Andres Rodriguez

Blake Rouse

Brendan Donegan

Dave Walker

Deepa

Enrique Chirivella Pérez

Fumihito YOSHIDA

MAAS Committers

Martin Packman

Mike Pontillo

james beedy

 === modified file 'src/provisioningserver/power/poweraction.py'
 --- src/provisioningserver/power/poweraction.py	2012-10-05 16:33:37 +0000
 +++ src/provisioningserver/power/poweraction.py	2013-03-15 11:55:24 +0000
@@ -31,6 +31,16 @@
  class PowerActionFail(Exception):
      """Raised when there's a problem executing a power script."""
++    def __init__(self, power_action, err):
++        self.power_action = power_action
++        self.err = err
++
++    def __str__(self):
++        message = "%s failed: %s" % (self.power_action.power_type, self.err)
++        if isinstance(self.err, subprocess.CalledProcessError) and self.err.output:
++            message += ":\n" + self.err.output.strip()
++        return message
++
  def get_power_templates_dir():
      """Get the power-templates directory from the config."""
@@ -99,32 +109,23 @@
              kwargs.update(self.get_extra_context())
              return template.substitute(kwargs)
          except NameError as error:
--            raise PowerActionFail(*error.args)
++            raise PowerActionFail(self, error)
      def run_shell(self, commands):
          """Execute raw shell script (as rendered from a template).
          :param commands: String containing shell script.
--        :param **kwargs: Keyword arguments are passed on to the template as
--            substitution values.
--        :return: Tuple of strings: stdout, stderr.
++        :raises: :class:`PowerActionFail`
          """
          # This might need retrying but it could be better to leave that
          # to the individual scripts.
          try:
--            proc = subprocess.Popen(
--                commands, shell=True, stdout=subprocess.PIPE,
--                stderr=subprocess.PIPE, close_fds=True)
--        except OSError as e:
--            raise PowerActionFail(e)
--
--        stdout, stderr = proc.communicate()
--        # TODO: log output on errors
--        code = proc.returncode
--        if code != 0:
--            raise PowerActionFail("%s failed with return code %s" % (
--                self.power_type, code))
--        return stdout, stderr
++            output = subprocess.check_output(
++                commands, shell=True, stderr=subprocess.STDOUT, close_fds=True)
++        except subprocess.CalledProcessError as e:
++            raise PowerActionFail(self, e)
++        # This output is only examined in tests, execute just ignores it
++        return output
      def execute(self, **kwargs):
          """Execute the template.
 === modified file 'src/provisioningserver/power/tests/test_poweraction.py'
 --- src/provisioningserver/power/tests/test_poweraction.py	2013-01-14 22:06:12 +0000
 +++ src/provisioningserver/power/tests/test_poweraction.py	2013-03-15 11:55:24 +0000
@@ -29,7 +29,8 @@
  from provisioningserver.utils import ShellTemplate
  from testtools.matchers import (
      FileContains,
--    MatchesRegex,
++    MatchesException,
++    Raises,
+     )
@@ -107,12 +108,11 @@
          pa = PowerAction(POWER_TYPE.WAKE_ON_LAN)
          template_name = factory.getRandomString()
          template = ShellTemplate("template: {{mac}}", name=template_name)
--        exception = self.assertRaises(
--            PowerActionFail, pa.render_template, template)
          self.assertThat(
--            exception.message, MatchesRegex(
--                "name 'mac' is not defined at line \d+ column \d+ "
--                "in file %s" % re.escape(template_name)))
++            lambda: pa.render_template(template),
++            Raises(MatchesException(PowerActionFail,
++                ".*name 'mac' is not defined at line \d+ column \d+ "
++                "in file %s" % re.escape(template_name))))
      def _create_template_file(self, template):
          """Create a temporary template file with the given contents."""
@@ -135,9 +135,16 @@
      def test_execute_raises_PowerActionFail_when_script_fails(self):
          path = self._create_template_file("this_is_not_valid_shell")
--        exception = self.assertRaises(PowerActionFail, self.run_action, path)
--        self.assertEqual(
--            "ether_wake failed with return code 127", exception.message)
++        self.assertThat(
++            lambda: self.run_action(path),
++            Raises(MatchesException(PowerActionFail,
++                "ether_wake failed.* return.* 127")))
++
++    def test_execute_raises_PowerActionFail_with_output(self):
++        path = self._create_template_file("echo reason for failure; exit 1")
++        self.assertThat(
++            lambda: self.run_action(path),
++            Raises(MatchesException(PowerActionFail, ".*:\nreason for failure")))
      def test_wake_on_lan_cannot_shut_down_node(self):
          pa = PowerAction(POWER_TYPE.WAKE_ON_LAN)
@@ -156,8 +163,8 @@
              action.get_template(), power_change='on',
              power_address='qemu://example.com/', system_id='mysystem',
              power_id='mysystem', username='me', virsh='echo')
--        stdout, stderr = action.run_shell(script)
--        self.assertIn("Got unknown power state from virsh", stderr)
++        output = action.run_shell(script)
++        self.assertIn("Got unknown power state from virsh", output)
      def test_fence_cdu_checks_state(self):
          # We can't test the fence_cdu template in detail (and it may be
@@ -170,8 +177,8 @@
              action.get_template(), power_change='on',
              power_address='mysystem', power_id='system',
              power_user='me', power_pass='me', fence_cdu='echo')
--        stdout, stderr = action.run_shell(script)
--        self.assertIn("Got unknown power state from fence_cdu", stderr)
++        output = action.run_shell(script)
++        self.assertIn("Got unknown power state from fence_cdu", output)
      def test_ipmi_checks_state(self):
          action = PowerAction(POWER_TYPE.IPMI)
@@ -180,8 +187,8 @@
              power_address='mystystem', power_user='me', power_pass='me',
              ipmipower='echo', ipmi_chassis_config='echo', config_dir='dir',
              ipmi_config='file.conf', power_driver='LAN')
--        stdout, stderr = action.run_shell(script)
--        self.assertIn("Got unknown power state from ipmipower", stderr)
++        output = action.run_shell(script)
++        self.assertIn("Got unknown power state from ipmipower", output)
      def configure_power_config_dir(self, path=None):
          """Configure POWER_CONFIG_DIR to `path`."""