Merge lp:~vila/tarmac/1392339-verify-timeout into lp:tarmac

Proposed by Vincent Ladeuil
Status: Needs review
Proposed branch: lp:~vila/tarmac/1392339-verify-timeout
Merge into: lp:tarmac
Diff against target: 154 lines (+66/-12)
2 files modified
tarmac/plugins/command.py (+27/-12)
tarmac/plugins/tests/test_command.py (+39/-0)
To merge this branch: bzr merge lp:~vila/tarmac/1392339-verify-timeout
Reviewer Review Type Date Requested Status
dobey Needs Fixing
Review via email: mp+241694@code.launchpad.net

Commit message

Protect tarmac from committing suicide, allow config to specify a verify_timeout for verify_command

Description of the change

This adds a test to exercise the timeout handling for verify_command.

This test revealed several issues that are fixed here:

- by default 'verify_command' will not create a new process group so having
  a safe-guard in place avoid killing tarmac itself,

- the code was expecting a command.config.debug, protect against this
  attribute not being set,

- add a verify_timeout config option so the default timeout can be
  overridden.

That last one is used in the test itself and is needed on my project where
the verify_command includes a juju deployment that can take more than 15
minutes (the default timeout).

To post a comment you must log in.
435. By Vincent Ladeuil

Mark the bug as fixed

Revision history for this message
dobey (dobey) :
review: Needs Fixing
Revision history for this message
Vincent Ladeuil (vila) wrote :

I've reverted the copyright changes (even though they are generated from bzr history by a plugin, feedback welcome).

I've added tests for bogus verify_timeout values.

See more details in inline comments.

436. By Vincent Ladeuil

Add test to cover bogus verify_timeout values handling.

437. By Vincent Ladeuil

Revert the controversial change.

Revision history for this message
dobey (dobey) wrote :

Some replies below, but also, I just realized I forgot to mention that
as you're making the verify command timeout configurable, you'll also
need to update that plug-in's section in docs/introduction.txt to
include documentation on the configuration option.

On Fri, 2014-11-14 at 08:53 +0000, Vincent Ladeuil wrote:
> Not according to bzr annotate which is what
> https://launchpad.net/bzr-update-copyright uses to update this line.

The bzr log/annotate/etc… commands do not understand the structure of
copyright law. They simply show data. Maybe all the dates in the file
need some fixing, but I would not trust a script to do that
automatically, in general. For projects where all copyright is always
assigned to a single entity, maybe it would be reasonable, but tarmac is
not such a project.

> Regarding the format that needs to be used in that line, do you have a
> reference so a proper bug can be filed against that plugin ?

See "The Double-Date Problem" section of:
http://blog.apastyle.org/apastyle/2010/01/the-generic-reference-when.html

> > # Copyright 2009 Paul Hummer
>
> This also says that 2013-2014 is wrong no ?

No. It says that Paul owns copyright on some code he wrote, in the file,
in 2009. The changes in 2013 were not by him. And that copyright was not
assigned to Canonical. The tarmac project is not under the CLA, and
multiple parties can/may own copyright.

> As mentioned in the cover letter:
> - the code was expecting a command.config.debug, protect against this
> attribute not being set,

I understand what was changed. The commit message doesn't tell me why it
was changed though. A bug describing the problem in detail can help me
understand the problem you are trying to fix with this change, better.
And a separate branch will let me more easily review the two issues
separately (even if one depends on the other).

> The code related to command.config.debug is reached only when the
> timeout triggers. It *is* related and the timeout handling *cannot* be
> fixed without fixing that.

But the changes to the timeout value and add a config setting for the
timeout, are not required to fix the debug issue. It may be *related*,
but it is a *separate* issue. A separate bug is preferred for that, with
a separate branch (which the timeout fix can have a prerequisite on, if
it requires that fix), ideally with tests to validate the fix for the
debug problem.

I'm not against the change itself, if it's necessary. I just prefer to
review it separately, as it is a separate problem. The timeout fix might
depend on it, but it does not depend on the timeout fix. As such, a
separate bug, and a separate branch, with regression tests, is the best
way to resolve it, and makes it easier for both changes to be reviewed.
So please split the change into one branch for the debug fixes, and a
branch for the timeout fixes, which may depend on the debug fix branch.
Thanks.

Unmerged revisions

437. By Vincent Ladeuil

Revert the controversial change.

436. By Vincent Ladeuil

Add test to cover bogus verify_timeout values handling.

435. By Vincent Ladeuil

Mark the bug as fixed

434. By Vincent Ladeuil

Protect tarmac from being killed with verify_command is killed.
Add a verify_timeout for long commands.
Drive-by fix command.config.debug not being set.

433. By Vincent Ladeuil

Add a test to exercise the timeout handling for very_command. This reveals that the test runner is terminated as part of the process group of the verify_command.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'tarmac/plugins/command.py'
2--- tarmac/plugins/command.py 2013-10-30 22:24:25 +0000
3+++ tarmac/plugins/command.py 2014-11-14 08:53:16 +0000
4@@ -26,10 +26,10 @@
5 tempfile = None
6 time = None
7
8-# The TIMEOUT setting (expressed in seconds) affects how long a test will run
9-# before it is deemed to be hung, and then appropriately terminated.
10-# It's principal use is preventing a job from hanging indefinitely and
11-# backing up the queue.
12+# The default TIMEOUT setting (expressed in seconds) affects how long a test
13+# will run before it is deemed to be hung, and then appropriately terminated.
14+# It's principal use is preventing a job from hanging indefinitely and backing
15+# up the queue.
16 # e.g. Usage: TIMEOUT = 60 * 15
17 # This will set the timeout to 15 minutes.
18 TIMEOUT = 60 * 15
19@@ -57,7 +57,7 @@
20 """
21 Kill the process group leader identified by pid and other group members
22
23- The verify_command should set it's process to a process group leader.
24+ The verify_command should set its process to a process group leader.
25 """
26 try:
27 os.killpg(os.getpgid(pid), signal)
28@@ -108,7 +108,11 @@
29 per_file_timestamps=False)
30 os.chdir(export_dest)
31
32+ # We set a new progress group by default otherwise if the veify_command
33+ # doesn't set one, *we* are killed when killem() is called.
34 proc = subprocess.Popen(self.verify_command,
35+ # Ensure we don't get killed
36+ preexec_fn=os.setpgrp,
37 shell=True,
38 stdin=subprocess.PIPE,
39 stdout=subprocess.PIPE,
40@@ -121,8 +125,19 @@
41 # stderr for too long.
42 open_readers = set([proc.stdout, proc.stderr])
43
44+ verify_timeout = self.get_config('verify_timeout', TIMEOUT, target)
45+ try:
46+ verify_timeout = int(verify_timeout)
47+ except (ValueError, TypeError):
48+ self.logger.warn('verify_timeout value is not an int: %r'
49+ % verify_timeout)
50+ verify_timeout = TIMEOUT
51+
52+ debug = self.get_config('debug', False, command, target)
53+
54 while open_readers:
55- rlist, wlist, xlist = select.select(open_readers, [], [], TIMEOUT)
56+ rlist, wlist, xlist = select.select(open_readers, [], [],
57+ verify_timeout)
58
59 if len(rlist) == 0:
60 if proc.poll() is not None:
61@@ -130,7 +145,7 @@
62
63 self.logger.debug(
64 "Command appears to be hung. There has been no output for"
65- " %d seconds. Sending SIGTERM." % TIMEOUT)
66+ " %d seconds. Sending SIGTERM." % verify_timeout)
67 killem(proc.pid, signal.SIGTERM)
68 time.sleep(5)
69
70@@ -140,12 +155,12 @@
71
72 # Drain the subprocess's stdout and stderr.
73 out_rest = proc.stdout.read()
74- if command.config.debug:
75+ if debug:
76 sys.stdout.write(out_rest)
77 stdout.write(out_rest)
78
79 err_rest = proc.stderr.read()
80- if command.config.debug:
81+ if debug:
82 sys.stderr.write(err_rest)
83 stderr.write(err_rest)
84 break
85@@ -155,16 +170,16 @@
86 if chunk == "":
87 open_readers.remove(proc.stdout)
88 else:
89- if command.config.debug:
90+ if debug:
91 sys.stdout.write(chunk)
92 stdout.write(chunk)
93-
94+
95 if proc.stderr in rlist:
96 chunk = os.read(proc.stderr.fileno(), 1024)
97 if chunk == "":
98 open_readers.remove(proc.stderr)
99 else:
100- if command.config.debug:
101+ if debug:
102 sys.stderr.write(chunk)
103 stderr.write(chunk)
104
105
106=== modified file 'tarmac/plugins/tests/test_command.py'
107--- tarmac/plugins/tests/test_command.py 2013-10-31 19:09:08 +0000
108+++ tarmac/plugins/tests/test_command.py 2014-11-14 08:53:16 +0000
109@@ -60,6 +60,45 @@
110 proposal=self.proposal)
111
112 @patch('tarmac.plugins.command.export')
113+ def test_run_timeout_exceeded(self, mocked):
114+ """Test that an exceeded timeout fails correctly."""
115+ target = Thing(config=Thing(verify_command="/bin/sleep 10",
116+ verify_timeout='1'),
117+ tree=Thing(abspath=os.path.abspath))
118+ self.assertRaises(command.VerifyCommandFailed,
119+ self.plugin.run,
120+ command=self.command, target=target, source=None,
121+ proposal=self.proposal)
122+
123+ @patch('tarmac.plugins.command.export')
124+ def test_run_bogus_timeout(self, mocked):
125+ """Test that a bugs timeout is ignored.
126+
127+ Some strings cannot be converted to an int, a log warning is emitted
128+ for feedback.
129+ """
130+ target = Thing(config=Thing(verify_command="/bin/true",
131+ verify_timeout='forty two'),
132+ tree=Thing(abspath=os.path.abspath))
133+ self.plugin.run(
134+ command=self.command, target=target, source=None,
135+ proposal=self.proposal)
136+
137+ @patch('tarmac.plugins.command.export')
138+ def test_run_also_bogus_timeout(self, mocked):
139+ """Test that a bugs timeout is ignored.
140+
141+ A random type cannot be converted to an int, a log warning is emitted
142+ for feedback.
143+ """
144+ target = Thing(config=Thing(verify_command="/bin/true",
145+ verify_timeout=object()),
146+ tree=Thing(abspath=os.path.abspath))
147+ self.plugin.run(
148+ command=self.command, target=target, source=None,
149+ proposal=self.proposal)
150+
151+ @patch('tarmac.plugins.command.export')
152 def test_run_exported_tree(self, mocked):
153 """Test that the plug-in runs the command in an exported tree."""
154 target = Thing(config=Thing(

Subscribers

People subscribed via source and target branches