Merge lp:~cjwatson/launchpad/buildmaster-cancel-properly into lp:launchpad
Status: | Merged |
---|---|
Approved by: | Colin Watson |
Approved revision: | no longer in the source branch. |
Merged at revision: | 16748 |
Proposed branch: | lp:~cjwatson/launchpad/buildmaster-cancel-properly |
Merge into: | lp:launchpad |
Prerequisite: | lp:~cjwatson/launchpad/buildstatus-aborted |
Diff against target: | 424 lines (+144/-98), 6 files modified: lib/lp/buildmaster/interactor.py (+0/-10); lib/lp/buildmaster/manager.py (+50/-17); lib/lp/buildmaster/tests/mock_slaves.py (+5/-1); lib/lp/buildmaster/tests/test_builder.py (+9/-13); lib/lp/buildmaster/tests/test_manager.py (+79/-54); lib/lp/soyuz/model/binarypackagebuild.py (+1/-3) |
To merge this branch: | bzr merge lp:~cjwatson/launchpad/buildmaster-cancel-properly |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
William Grant | code | | Approve |
Review via email: mp+177580@code.launchpad.net |
Commit message
Cancel builds by sending an "abort" command to the build slave, with a three-minute timeout after which we give up and rescue the builder. This should now work even on non-virtualised builders.
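The abort-then-timeout behaviour described in the commit message could be sketched roughly as follows. This is a minimal illustration, not the code from manager.py: `CancelTracker`, its `check` method, and the `request_abort` and `rescue` callables are hypothetical names, and only `CANCEL_TIMEOUT`, `date_cancel` and the three-minute figure come from this proposal. Twisted is deliberately left out so the sketch runs on the standard library alone.

```python
class CancelTracker:
    """Tracks the give-up deadline for one pending build cancellation."""

    CANCEL_TIMEOUT = 180  # seconds; three minutes, per the commit message

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self.date_cancel = None  # deadline; None means no abort is pending

    def check(self, cancelling, request_abort, rescue):
        """Run one scan. Return True if the builder was rescued."""
        if not cancelling:
            # Nothing to cancel, so forget any stale deadline.
            self.date_cancel = None
            return False
        if self.date_cancel is None:
            # First scan that sees the cancellation: ask the slave to abort
            # and start the timeout.
            request_abort()
            self.date_cancel = self._clock() + self.CANCEL_TIMEOUT
            return False
        if self._clock() < self.date_cancel:
            # Still waiting for the slave to finish aborting.
            return False
        # The slave did not abort in time: give up and rescue the builder.
        rescue()
        self.date_cancel = None
        return True
```

Passing the clock in as a callable mirrors the `self._clock` seen in the diff and lets tests advance time without sleeping.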
Description of the change
This is the second half of the master side of build cancellation, following on from the first half of the master side in https:/
The bug that had previously led test_abort to be disabled should be fixed by using scan-for-processes on the slave rather than killing a specific process.
We should be able to get rid of most of BuilderStatus.
For QA, we should test on dogfood with both virtualised and non-virtualised builders, using builds that (a) hang "normally" but are reasonably killable and (b) attempt to spawn processes faster than scan-for-processes can kill them. It might also be a good idea to attempt some kind of fake slave that can be told to enter the ABORTED state without the master asking it to do so.
59 + self.logger.info("Build '%s' failed to cancel" % build.title)
I'd mention the builder here to make logs searchable.
80 + builder.requestAbort()
81 + self.date_cancel = self._clock.seconds() + self.CANCEL_TIMEOUT
82 + return defer.succeed(False)
requestAbort returns a Deferred that you might want to handle.
Additionally, what will happen if I restart buildd-manager while the builder is ABORTING? It'll attempt to call requestAbort again, which will probably fail, but it'll be ignored because you don't handle the Deferred.
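One possible way to address both review points is to log a failed abort request with the builder's name and still start the timeout, so that a restart during ABORTING falls through to the rescue path rather than failing silently. The sketch below is a synchronous stand-in using plain exceptions, not Twisted: in the real code the `except` branch would correspond to an errback attached to the Deferred returned by `requestAbort`. The function `start_cancel` and its parameters are hypothetical.

```python
import logging

logger = logging.getLogger("buildmaster-cancel-sketch")

CANCEL_TIMEOUT = 180  # seconds, matching the three-minute timeout


def start_cancel(request_abort, now, builder_name, build_title):
    """Request an abort and return the deadline for giving up.

    A failed request is logged instead of being dropped. The deadline is
    set either way, so the timeout can still rescue the builder.
    """
    try:
        request_abort()
    except Exception as exc:
        # Naming the builder keeps the logs searchable.
        logger.warning(
            "Abort of build '%s' on builder '%s' failed: %s",
            build_title, builder_name, exc)
    return now + CANCEL_TIMEOUT
```

Whether a failed abort should wait out the full timeout or trigger an immediate rescue is a design choice this sketch does not settle.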