worker: Properly retry on failures we think might be temporary
We're seeing a test run currently looping with this trace:
    WARNING: Saw Temporary failure resolving in log, which is a sign of a temporary failure.
    WARNING: Retrying in 5 minutes. Log follows:
    [ ... log ... ]
    gzip: /tmp/autopkgtest-work.j5qth004/out/log: No such file or directory
    Traceback (most recent call last):
    [ ... cut some bits of the trace ... ]
      File "/home/ubuntu/autopkgtest-cloud/worker/worker", line 645, in request
        process_output_dir(out_dir, pkgname, code)
      File "/home/ubuntu/autopkgtest-cloud/worker/worker", line 172, in process_output_dir
        subprocess.check_call(['gzip', '-9', os.path.join(dir, 'log')])
      File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command '['gzip', '-9', '/tmp/autopkgtest-work.j5qth004/out/log']' returned non-zero exit status 1
But we should not be calling process_output_dir() when we're retrying.
It should only be called when we are about to upload the directory to
swift.
What's happening is, we have this logic:
    for retry in range(3):
        <run the test>
        <did it permanently fail?> { /* 1 */
            <grep the log, to see if we think this might be transient>
            <break if not, otherwise print a warning, *delete the
             output directory* and retry>
        }
        <did it temporarily fail?> { /* 2 */
            <grep the log, to see if we think this might be permanent>
            <print a warning, delete the output directory and retry if
             not, otherwise break>
        } else { /* 3, passed */
            <break, no more retries, upload the result>
        }
In this run we hit case 1: the test failed permanently, but the log
suggests the failure might be transient, so we clean up the output
directory and retry. But since we have two *separate* if statements
here, the second one's else clause is entered - which is only meant for
a run that has passed cleanly - and we break out of the loop, then go on
to try to upload the result. This fails, because we cleaned up the
directory.
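
The fall-through can be reproduced in isolation. This is only a sketch
of the control flow described above, not the worker's actual code: the
function name, the outcome strings and the action strings are all
hypothetical, and it assumes the "retry" path of case 1 falls through to
the second if rather than jumping straight to the next iteration.

```python
# Hypothetical model of one loop iteration with two *separate* ifs.
# Names and strings are illustrative; they do not come from the worker.
def buggy_decision(outcome, looks_transient, looks_permanent):
    """Return the list of actions one iteration would take."""
    actions = []
    if outcome == "permanent":            # case 1
        if looks_transient:
            actions.append("delete output dir, retry")
        else:
            actions.append("break")
    if outcome == "temporary":            # case 2: a separate if
        if looks_permanent:
            actions.append("break")
        else:
            actions.append("delete output dir, retry")
    else:                                 # case 3: meant for a clean pass
        actions.append("break and upload")
    return actions


# A "permanent" failure that looks transient also lands in case 3.
print(buggy_decision("permanent", True, False))
```

With these assumptions, a permanent failure that looks transient takes
both the "delete and retry" action and the "break and upload" action,
which matches the failed gzip call in the trace.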
Instead, we should have one if / else if / else chain here. If we enter
the first case, for 'permanent' failures, we should never go on to enter
any of the others.
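
A minimal sketch of the single-chain shape, under the same caveat: the
names run_with_retries and run_once, the outcome strings and the
three-tuple returned by run_once are hypothetical stand-ins, and the
real worker's cleanup and five-minute wait are only noted in comments.

```python
# Hypothetical sketch: one if/elif/else chain, so exactly one case runs
# per iteration. run_once() stands in for "run the test and grep the log".
def run_with_retries(run_once, tries=3):
    """Call run_once() up to `tries` times.

    run_once() returns (outcome, looks_transient, looks_permanent).
    Returns (final outcome, number of attempts used).
    """
    outcome = None
    attempt = 0
    for attempt in range(1, tries + 1):
        outcome, looks_transient, looks_permanent = run_once()
        if outcome == "permanent":        # case 1
            if not looks_transient:
                break
            # the real worker would warn, delete the output dir and wait
        elif outcome == "temporary":      # case 2
            if looks_permanent:
                break
            # same cleanup and wait before the next attempt
        else:                             # case 3: passed
            break
    return outcome, attempt


# First run fails but looks transient, second run passes.
runs = iter([("permanent", True, False), ("passed", False, False)])
print(run_with_retries(lambda: next(runs)))
```

Because the cases are chained, a run that takes the retry path of case 1
can no longer reach the "passed" branch in the same iteration.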