Merge lp:~gz/maas/detailed_poweractionfail_1155175 into lp:~maas-committers/maas/trunk

Proposed by Martin Packman
Status: Merged
Approved by: Martin Packman
Approved revision: no longer in the source branch.
Merged at revision: 1456
Proposed branch: lp:~gz/maas/detailed_poweractionfail_1155175
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 145 lines (+40/-32)
2 files modified
src/provisioningserver/power/poweraction.py (+18/-17)
src/provisioningserver/power/tests/test_poweraction.py (+22/-15)
To merge this branch: bzr merge lp:~gz/maas/detailed_poweractionfail_1155175
Reviewer Review Type Date Requested Status
Julian Edwards (community) Approve
Review via email: mp+153425@code.launchpad.net

Commit message

Give more detailed error message on failures from power management scripts

Description of the change

Give more detailed error message on failures from power management scripts.

This changes how the scripts are called a little to take advantage of nicer wrappers included in Python 2.7, and customises the PowerActionFail exception type to stringify with more details.

How exactly we want the output shown is an open question, currently it's just on (at least one) new line in the error string, perhaps indenting or otherwise wrapping it would be preferable.

Docstring update was needed, part of it (taking **kwargs) is just wrong, and the return is entirely ignored in real code so should probably not be promised.

Some testing specifics were fiddled with, just to make a couple of changes slightly neater.
--

To post a comment you must log in.
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Thanks for the branch Martin, a definite improvement. I am a little concerned about one of the changes in [2] below so marking needs-fixing for that.

Cheers.

[1]

53 + output = subprocess.check_output(
54 + commands, shell=True, stderr=subprocess.STDOUT, close_fds=True)
55 + except Exception as e:

Catching a bare Exception is bad form, can you be more specific? The original caught OSError.

[2]

56 + raise PowerActionFail(self, e)
57 + # This output is only examined in tests, execute just ignores it
58 + return output

You've changed this from a PIPE to using the parent's FDs. I wrote the original code and although I can't remember why I used a PIPE I am pretty sure I had a good reason, so I reckon you should change it back to PIPE. I have a sneaking suspicion that some of the scripts caused a hang or something weird in the output from the celery process.
I should have added a comment about it :/

[3]

Our coding standards say to add a line break immediately after an opening parenthesis, so can you fix the following bits please:

86 + self.assertThat(lambda: pa.render_template(template),
87 + Raises(MatchesException(PowerActionFail,
88 + ".*name 'mac' is not defined at line \d+ column \d+ "
89 + "in file %s" % re.escape(template_name))))

snip....

95 def test_execute_raises_PowerActionFail_when_script_fails(self):
96 path = self._create_template_file("this_is_not_valid_shell")
97 - exception = self.assertRaises(PowerActionFail, self.run_action, path)
98 - self.assertEqual(
99 - "ether_wake failed with return code 127", exception.message)
100 + self.assertThat(lambda: self.run_action(path),
101 + Raises(MatchesException(PowerActionFail,
102 + "ether_wake failed.* return.* 127")))
103 +
104 + def test_execute_raises_PowerActionFail_with_output(self):
105 + path = self._create_template_file("echo reason for failure; exit 1")
106 + self.assertThat(lambda: self.run_action(path),
107 + Raises(MatchesException(PowerActionFail, ".*:\nreason for failure")))

review: Needs Fixing
Revision history for this message
Martin Packman (gz) wrote :

> Catching a bare Exception is bad form, can you be more specific? The original
> caught OSError.

It's not as bad as it looks, because it's effectively a reraise, and excludes BaseException. That said, it should just be wrapping CalledProcessError really. The OSError clause in the old code was pretty bogus, it was untested, and would only be hit if /bin/sh doesn't exist... which is a broken enough situation that leaving the error unwrapped is probably preferable. I'll make the except more specific.

> You've changed this from a PIPE to using the parent's FDs. I wrote the
> original code and although I can't remember why I used a PIPE I am pretty sure
> I had a good reason, so I reckon you should change it back to PIPE. I have a
> sneaking suspicion that some of the scripts caused a hang or something weird
> in the output from the celery process.

So, the old behaviour was:
* Redirect stdout and stderr to separate pipes
* Select on stdout and stderr and fill up a buffer for each till the process exits
* Discard stdout and stderr

The new behaviour is:
* Redirect stdout and stderr to the same pipe
* Read the combined stdout/stderr output and wait till the process exits
* If the exitcode is non-zero, include stdout/stderr output in the exception
* Discard stdout/stderr output

I could write a test that demonstrates the code doesn't block with large outputs, but the subprocess module documentation is pretty clear (even if the interfaces themselves are not).

> Our coding standards say to add a line break immediately after an opening
> parenthesis, so can you fix the following bits please:

Done.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Friday 15 Mar 2013 11:48:23 you wrote:
> The new behaviour is:
> * Redirect stdout and stderr to the same pipe

It's not a new pipe though, it's the parent's pipe.

> * Read the combined stdout/stderr output and wait till the process exits
> * If the exitcode is non-zero, include stdout/stderr output in the exception
> * Discard stdout/stderr output
>
> I could write a test that demonstrates the code doesn't block with large
> outputs, but the subprocess module documentation is pretty clear (even if
> the interfaces themselves are not).

Ok I don't want to block you on this - please test the power scripts in the
lab to make sure they pass muster and then go ahead and land it.

Thanks for the fix!

review: Approve
Revision history for this message
Martin Packman (gz) wrote :

> On Friday 15 Mar 2013 11:48:23 you wrote:
> > The new behaviour is:
> > * Redirect stdout and stderr to the same pipe
>
> It's not a new pipe though, it's the parent's pipe.

Okay, I think the confusion here is that subprocess.check_output implies stdout=subprocess.PIPE without it being explicitly specified. If stdout was inherited from the parent, how would it be a return value from the call?

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Saturday 16 Mar 2013 14:10:25 you wrote:
> > On Friday 15 Mar 2013 11:48:23 you wrote:
> > > The new behaviour is:
> > > * Redirect stdout and stderr to the same pipe
> >
> > It's not a new pipe though, it's the parent's pipe.
>
> Okay, I think the confusion here is that subprocess.check_output implies
> stdout=subprocess.PIPE without it being explicitly specified. If stdout was
> inherited from the parent, how would it be a return value from the call?

I don't think that assumption is valid. AFAICT it re-uses the parent's
stdout/stderr which is why I raised this in the first place :)

From the docs:

stdin, stdout and stderr specify the executed program’s standard input,
standard output and standard error file handles, respectively. Valid values
are PIPE, an existing file descriptor (a positive integer), an existing file
object, and None. PIPE indicates that a new pipe to the child should be
created. With the default settings of None, no redirection will occur; the
child’s file handles will be inherited from the parent

Revision history for this message
Martin Packman (gz) wrote :

Open /usr/lib/python2.7/subprocess.py on your machine and look at the implementation of check_output, it's a pretty simple helper:

    def check_output(*popenargs, **kwargs):
        ...
        if 'stdout' in kwargs:
            raise ValueError('stdout argument not allowed, it will be overridden.')
        process = Popen(stdout=PIPE, *popenargs, **kwargs)
        ...

Revision history for this message
Julian Edwards (julian-edwards) wrote :

On Monday 18 Mar 2013 11:00:47 you wrote:
> Open /usr/lib/python2.7/subprocess.py on your machine and look at the
> implementation of check_output, it's a pretty simple helper:
>
> def check_output(*popenargs, **kwargs):
> ...
> if 'stdout' in kwargs:
> raise ValueError('stdout argument not allowed, it will be
> overridden.') process = Popen(stdout=PIPE, *popenargs, **kwargs)
> ...

Soooo, the documentation is telling porkies. Awesome! :(

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/power/poweraction.py'
2--- src/provisioningserver/power/poweraction.py 2012-10-05 16:33:37 +0000
3+++ src/provisioningserver/power/poweraction.py 2013-03-15 11:55:24 +0000
4@@ -31,6 +31,16 @@
5 class PowerActionFail(Exception):
6 """Raised when there's a problem executing a power script."""
7
8+ def __init__(self, power_action, err):
9+ self.power_action = power_action
10+ self.err = err
11+
12+ def __str__(self):
13+ message = "%s failed: %s" % (self.power_action.power_type, self.err)
14+ if isinstance(self.err, subprocess.CalledProcessError) and self.err.output:
15+ message += ":\n" + self.err.output.strip()
16+ return message
17+
18
19 def get_power_templates_dir():
20 """Get the power-templates directory from the config."""
21@@ -99,32 +109,23 @@
22 kwargs.update(self.get_extra_context())
23 return template.substitute(kwargs)
24 except NameError as error:
25- raise PowerActionFail(*error.args)
26+ raise PowerActionFail(self, error)
27
28 def run_shell(self, commands):
29 """Execute raw shell script (as rendered from a template).
30
31 :param commands: String containing shell script.
32- :param **kwargs: Keyword arguments are passed on to the template as
33- substitution values.
34- :return: Tuple of strings: stdout, stderr.
35+ :raises: :class:`PowerActionFail`
36 """
37 # This might need retrying but it could be better to leave that
38 # to the individual scripts.
39 try:
40- proc = subprocess.Popen(
41- commands, shell=True, stdout=subprocess.PIPE,
42- stderr=subprocess.PIPE, close_fds=True)
43- except OSError as e:
44- raise PowerActionFail(e)
45-
46- stdout, stderr = proc.communicate()
47- # TODO: log output on errors
48- code = proc.returncode
49- if code != 0:
50- raise PowerActionFail("%s failed with return code %s" % (
51- self.power_type, code))
52- return stdout, stderr
53+ output = subprocess.check_output(
54+ commands, shell=True, stderr=subprocess.STDOUT, close_fds=True)
55+ except subprocess.CalledProcessError as e:
56+ raise PowerActionFail(self, e)
57+ # This output is only examined in tests, execute just ignores it
58+ return output
59
60 def execute(self, **kwargs):
61 """Execute the template.
62
63=== modified file 'src/provisioningserver/power/tests/test_poweraction.py'
64--- src/provisioningserver/power/tests/test_poweraction.py 2013-01-14 22:06:12 +0000
65+++ src/provisioningserver/power/tests/test_poweraction.py 2013-03-15 11:55:24 +0000
66@@ -29,7 +29,8 @@
67 from provisioningserver.utils import ShellTemplate
68 from testtools.matchers import (
69 FileContains,
70- MatchesRegex,
71+ MatchesException,
72+ Raises,
73 )
74
75
76@@ -107,12 +108,11 @@
77 pa = PowerAction(POWER_TYPE.WAKE_ON_LAN)
78 template_name = factory.getRandomString()
79 template = ShellTemplate("template: {{mac}}", name=template_name)
80- exception = self.assertRaises(
81- PowerActionFail, pa.render_template, template)
82 self.assertThat(
83- exception.message, MatchesRegex(
84- "name 'mac' is not defined at line \d+ column \d+ "
85- "in file %s" % re.escape(template_name)))
86+ lambda: pa.render_template(template),
87+ Raises(MatchesException(PowerActionFail,
88+ ".*name 'mac' is not defined at line \d+ column \d+ "
89+ "in file %s" % re.escape(template_name))))
90
91 def _create_template_file(self, template):
92 """Create a temporary template file with the given contents."""
93@@ -135,9 +135,16 @@
94
95 def test_execute_raises_PowerActionFail_when_script_fails(self):
96 path = self._create_template_file("this_is_not_valid_shell")
97- exception = self.assertRaises(PowerActionFail, self.run_action, path)
98- self.assertEqual(
99- "ether_wake failed with return code 127", exception.message)
100+ self.assertThat(
101+ lambda: self.run_action(path),
102+ Raises(MatchesException(PowerActionFail,
103+ "ether_wake failed.* return.* 127")))
104+
105+ def test_execute_raises_PowerActionFail_with_output(self):
106+ path = self._create_template_file("echo reason for failure; exit 1")
107+ self.assertThat(
108+ lambda: self.run_action(path),
109+ Raises(MatchesException(PowerActionFail, ".*:\nreason for failure")))
110
111 def test_wake_on_lan_cannot_shut_down_node(self):
112 pa = PowerAction(POWER_TYPE.WAKE_ON_LAN)
113@@ -156,8 +163,8 @@
114 action.get_template(), power_change='on',
115 power_address='qemu://example.com/', system_id='mysystem',
116 power_id='mysystem', username='me', virsh='echo')
117- stdout, stderr = action.run_shell(script)
118- self.assertIn("Got unknown power state from virsh", stderr)
119+ output = action.run_shell(script)
120+ self.assertIn("Got unknown power state from virsh", output)
121
122 def test_fence_cdu_checks_state(self):
123 # We can't test the fence_cdu template in detail (and it may be
124@@ -170,8 +177,8 @@
125 action.get_template(), power_change='on',
126 power_address='mysystem', power_id='system',
127 power_user='me', power_pass='me', fence_cdu='echo')
128- stdout, stderr = action.run_shell(script)
129- self.assertIn("Got unknown power state from fence_cdu", stderr)
130+ output = action.run_shell(script)
131+ self.assertIn("Got unknown power state from fence_cdu", output)
132
133 def test_ipmi_checks_state(self):
134 action = PowerAction(POWER_TYPE.IPMI)
135@@ -180,8 +187,8 @@
136 power_address='mystystem', power_user='me', power_pass='me',
137 ipmipower='echo', ipmi_chassis_config='echo', config_dir='dir',
138 ipmi_config='file.conf', power_driver='LAN')
139- stdout, stderr = action.run_shell(script)
140- self.assertIn("Got unknown power state from ipmipower", stderr)
141+ output = action.run_shell(script)
142+ self.assertIn("Got unknown power state from ipmipower", output)
143
144 def configure_power_config_dir(self, path=None):
145 """Configure POWER_CONFIG_DIR to `path`."""