Merge lp:~jtv/launchpad/bug-717969 into lp:launchpad
- bug-717969
- Merge into devel
Status: | Merged | ||||||||
---|---|---|---|---|---|---|---|---|---|
Merged at revision: | 14459 | ||||||||
Proposed branch: | lp:~jtv/launchpad/bug-717969 | ||||||||
Merge into: | lp:launchpad | ||||||||
Diff against target: |
1482 lines (+430/-237) 12 files modified
lib/lp/buildmaster/interfaces/builder.py (+6/-4) lib/lp/buildmaster/manager.py (+78/-50) lib/lp/buildmaster/model/builder.py (+17/-8) lib/lp/buildmaster/model/buildfarmjobbehavior.py (+63/-33) lib/lp/buildmaster/model/packagebuild.py (+91/-57) lib/lp/buildmaster/tests/test_builder.py (+1/-1) lib/lp/buildmaster/tests/test_manager.py (+127/-40) lib/lp/buildmaster/tests/test_packagebuild.py (+10/-20) lib/lp/code/model/tests/test_sourcepackagerecipebuild.py (+6/-8) lib/lp/services/database/transaction_policy.py (+5/-2) lib/lp/soyuz/tests/test_binarypackagebuild.py (+11/-4) lib/lp/translations/model/translationtemplatesbuildbehavior.py (+15/-10) |
||||||||
To merge this branch: | bzr merge lp:~jtv/launchpad/bug-717969 | ||||||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Julian Edwards (community) | Approve | ||
Review via email: mp+83022@code.launchpad.net |
Commit message
Read-only transactions by default in buildmaster.
Description of the change
= Bug 717969 =
This is an oversized branch. Don't just pick it up out of a sense of duty; if it's too big or you're not certain about the domain-specific stuff, contact one of the Soyuz experts — Julian, Steve, or William.
The buildmaster was showing very strange symptoms. In a nutshell, cases where observed where it went through steps A, B, and C; we could see that steps A and C were completed successfully; but the expected results from step B were just not there.
The cause turns out to be a dangerous mix of global state: our database transaction management is global to a thread (all of the thread's changes are committed or all are aborted) but our use of twisted makes a single python thread/process flit back and forth between servicing multiple logical threads of execution.
Each of those threads of execution might, from its point of view: make some changes to ORM-backed objects, then initiate an asynchronous request (talking to the librarian or a build slave), block for the result to arrive, make some more ORM changes, block again, and later commit or abort the database transaction.
That is purely from the point of view of a logical thread of execution. Twisted experts are quick to point out that “nothing blocks” in Twisted, but that is exactly the problem: while one logical thread of execution “blocks,” another gets to execute. And who's to say whether that other one might not commit or abort the first one's changes prematurely? With the numbers of builders we drive, it's bound to happen regularly.
We discussed a few architectural solutions: farming off the ORM changes to a worker thread, doing all the work in threads, and so on. The real problem is that the ORM changes are inlined in the buildmaster, but largely hidden away in the various build-related class hierarchies. Twisted is meant for glue between systems, not for interleaving complex and dispersed stateful machinery with the glue's callback structure.
Ideally, we should separate the “glue” from the “machinery.” But it's complex and fragile, so the next step from here is to get the code under control to the point where we can reason reliably about it. Once we know more about what the code really does, we'll have more freedom to reorganize it.
This branch represents that next step from here: make the buildmaster run in read-only transaction policy. It will not be able to make any changes to the database (or ORM-backed objects), except in sections of code that are explicitly wrapped in read-write transaction policies. Now we know exactly where the buildmaster may change the database — doing it anywhere else would be an error. We keep the read-write sections absolutely minimal, and try to avoid method calls where Twisted context switches might hide. Unfortunately neither Twisted nor Storm seems to have a facility for forbidding them altogether.
Every read-write section commits immediately. That means more fine-grained state changes. I can't be 100% certain that early commits will not produce any unwanted object states, and I can't be 100% certain that the read-write policies cover all code paths that need them. But we've run several types of build through the patched build farm, and we've stress-tested it against about 5 concurrent builders working all-out. As expected, we found code that still lacked an explicit read-write policy — but only one of them. There may be more, but only production use will find them all.
For Q/A, we once again perform builds on staging and/or dogfood, of as many kinds as we can. Include concurrent builds, and successes as well as failures. Verify that the build products (tarballs, packages, changes files, build logs) all go into the librarian, and that the builds end up in a proper terminal state. The contents of the build logs should be consistent with that end state.
As for tests… sorry, I just run them all!
No lint that I could help, copyrights updated, imports formatted as per policy. Thank you for reading this far.
Jeroen
Steve Kowalik (stevenk) wrote : | # |
Jeroen T. Vermeulen (jtv) wrote : | # |
> Please also link bug 785679 to this MP, you've fixed it too since you've re-
> enabled the test.
Done. Apparently the test didn't use our custom setup for a Twisted-enabled test.
Julian Edwards (julian-edwards) wrote : | # |
Thanks a *bunch* for tackling this, I realise it's a bit soul-sapping. If it helps, know that I have spent most of the day going through your code to make sure it's all OK :)
319 + transaction.
320 + with DatabaseTransac
321 + library_file = getUtility(
322 + filename, bytes_written, out_file,
323 + contentType=
324 + restricted=private)
325 + transaction.
Should that commit at 319 be an abort()? Or are you trying to make it bomb out deliberately if there's data getting written where it should not be?
349 transaction.
350 + with DatabaseTransac
351 + candidate.
352 + transaction.
Same here.
And in fact in quite a few places so I won't list them all.
586 -from lp.registry.
587 - PackagePublishi
588 - )
589 +from lp.registry.
Why? :)
634 - d = method(librarian, slave_status, logger, send_notification)
635 - return d
636 + return None
637 + else:
638 + return method(librarian, slave_status, logger, send_notification)
I'd prefer if you left this with the local "d" variable because it makes it clear it's returning a Deferred. This is important because anything that reminds the caller is a good thing, especially morons like me who cocked it up once :)
945 - self.useFixture
All of the other tests don't need this, and it's pretty slow to set up, which is why I recently moved it out of setUp(). Is there a good reason to put it back?
Finally - I'd love to see a test that tries to recreate the original bug. I think it's actually quite straightforward, I'll try and explain the steps:
* Set up a build connected to a BrokenSlave so that when it's polled it'll abort().
* Create and get a build at the stage where it's inside storeBuildInfo() but waiting on the Deferred to fire that returns the log.
* Fire the reactor up so the outstanding Deferreds are processed - the abort() should get called before the log Deferred fires. That will abort all the work done in _handleStatus_OK() so is easy to test for.
Hope that makes sense!
Jeroen T. Vermeulen (jtv) wrote : | # |
> Thanks a *bunch* for tackling this, I realise it's a bit soul-sapping. If it
> helps, know that I have spent most of the day going through your code to make
> sure it's all OK :)
That helps a lot. And I'm well aware that reviewing this stuff is more soul-sapping than writing it is. Thank you for wading through your 1290 lines of personal hell!
> 319 + transaction.
> 320 + with DatabaseTransac
> 321 + library_file =
> getUtility(
> 322 + filename, bytes_written, out_file,
> 323 + contentType=
> 324 + restricted=private)
> 325 + transaction.
>
> Should that commit at 319 be an abort()? Or are you trying to make it bomb
> out deliberately if there's data getting written where it should not be?
And there you put your finger on the deciding consideration, which I had totally forgotten in the meantime. Yes, thanks to DatabaseTransac
(Also, I believe the ORM makes abort slower because it needs to scan the live objects for changes that need undoing).
> 349 transaction.
> 350 + with DatabaseTransac
> 351 + candidate.
> 352 + transaction.
>
> Same here.
>
> And in fact in quite a few places so I won't list them all.
I went with the commit throughout, which only required 2 changes.
> 586 -from lp.registry.
> 587 - PackagePublishi
> 588 - )
> 589 +from lp.registry.
>
> Why? :)
Automated: utilities/
> 634 - d = method(librarian, slave_status, logger,
> send_notification)
> 635 - return d
> 636 + return None
> 637 + else:
> 638 + return method(librarian, slave_status, logger,
> send_notification)
>
> I'd prefer if you left this with the local "d" variable because it makes it
> clear it's returning a Deferred. This is important because anything that
> reminds the caller is a good thing, especially morons like me who cocked it up
> once :)
You're right. Had I come across this a bit later in the branch, when I had more grey hairs but also more twisted/
> 945 - self.useFixture
>
> All of the other tests don't need this, and it's pretty slow to set up, which
> is why I recently moved it out of setUp(). Is there a good reason to put it
> back?
Probably a misguided step in changing the test cases over from that weirdly-named Twisted thing to our own TestCaseWithFac
> Finally - I'd love to see a test that tries to recreate the original bug. I
> think it's actually quite s...
Julian Edwards (julian-edwards) : | # |
Preview Diff
1 | === modified file 'lib/lp/buildmaster/interfaces/builder.py' |
2 | --- lib/lp/buildmaster/interfaces/builder.py 2011-07-14 21:49:37 +0000 |
3 | +++ lib/lp/buildmaster/interfaces/builder.py 2011-12-08 16:19:26 +0000 |
4 | @@ -1,4 +1,4 @@ |
5 | -# Copyright 2009 Canonical Ltd. This software is licensed under the |
6 | +# Copyright 2009-2011 Canonical Ltd. This software is licensed under the |
7 | # GNU Affero General Public License version 3 (see the file LICENSE). |
8 | |
9 | # pylint: disable-msg=E0211,E0213 |
10 | @@ -25,10 +25,10 @@ |
11 | export_as_webservice_entry, |
12 | export_read_operation, |
13 | exported, |
14 | + operation_for_version, |
15 | operation_parameters, |
16 | + operation_returns_collection_of, |
17 | operation_returns_entry, |
18 | - operation_returns_collection_of, |
19 | - operation_for_version, |
20 | ) |
21 | from lazr.restful.fields import ( |
22 | Reference, |
23 | @@ -50,12 +50,12 @@ |
24 | from lp.app.validators.name import name_validator |
25 | from lp.app.validators.url import builder_url_validator |
26 | from lp.registry.interfaces.role import IHasOwner |
27 | -from lp.soyuz.interfaces.processor import IProcessor |
28 | from lp.services.fields import ( |
29 | Description, |
30 | PersonChoice, |
31 | Title, |
32 | ) |
33 | +from lp.soyuz.interfaces.processor import IProcessor |
34 | |
35 | |
36 | class BuildDaemonError(Exception): |
37 | @@ -195,6 +195,8 @@ |
38 | |
39 | def setSlaveForTesting(proxy): |
40 | """Sets the RPC proxy through which to operate the build slave.""" |
41 | + # XXX JeroenVermeulen 2011-11-09, bug=888010: Don't use this. |
42 | + # It's a trap. See bug for details. |
43 | |
44 | def verifySlaveBuildCookie(slave_build_id): |
45 | """Verify that a slave's build cookie is consistent. |
46 | |
47 | === modified file 'lib/lp/buildmaster/manager.py' |
48 | --- lib/lp/buildmaster/manager.py 2011-10-31 09:23:47 +0000 |
49 | +++ lib/lp/buildmaster/manager.py 2011-12-08 16:19:26 +0000 |
50 | @@ -1,4 +1,4 @@ |
51 | -# Copyright 2009 Canonical Ltd. This software is licensed under the |
52 | +# Copyright 2009-2011 Canonical Ltd. This software is licensed under the |
53 | # GNU Affero General Public License version 3 (see the file LICENSE). |
54 | |
55 | """Soyuz buildd slave manager logic.""" |
56 | @@ -23,10 +23,6 @@ |
57 | from zope.component import getUtility |
58 | |
59 | from lp.buildmaster.enums import BuildStatus |
60 | -from lp.buildmaster.interfaces.buildfarmjobbehavior import ( |
61 | - BuildBehaviorMismatch, |
62 | - ) |
63 | -from lp.buildmaster.model.builder import Builder |
64 | from lp.buildmaster.interfaces.builder import ( |
65 | BuildDaemonError, |
66 | BuildSlaveFailure, |
67 | @@ -34,6 +30,11 @@ |
68 | CannotFetchFile, |
69 | CannotResumeHost, |
70 | ) |
71 | +from lp.buildmaster.interfaces.buildfarmjobbehavior import ( |
72 | + BuildBehaviorMismatch, |
73 | + ) |
74 | +from lp.buildmaster.model.builder import Builder |
75 | +from lp.services.database.transaction_policy import DatabaseTransactionPolicy |
76 | |
77 | |
78 | BUILDD_MANAGER_LOG_NAME = "slave-scanner" |
79 | @@ -111,13 +112,17 @@ |
80 | # algorithm for polling. |
81 | SCAN_INTERVAL = 15 |
82 | |
83 | - def __init__(self, builder_name, logger): |
84 | + def __init__(self, builder_name, logger, clock=None): |
85 | self.builder_name = builder_name |
86 | self.logger = logger |
87 | + if clock is None: |
88 | + clock = reactor |
89 | + self._clock = clock |
90 | |
91 | def startCycle(self): |
92 | """Scan the builder and dispatch to it or deal with failures.""" |
93 | self.loop = LoopingCall(self.singleCycle) |
94 | + self.loop.clock = self._clock |
95 | self.stopping_deferred = self.loop.start(self.SCAN_INTERVAL) |
96 | return self.stopping_deferred |
97 | |
98 | @@ -138,51 +143,58 @@ |
99 | 1. Print the error in the log |
100 | 2. Increment and assess failure counts on the builder and job. |
101 | """ |
102 | - # Make sure that pending database updates are removed as it |
103 | - # could leave the database in an inconsistent state (e.g. The |
104 | - # job says it's running but the buildqueue has no builder set). |
105 | + # Since this is a failure path, we could be in a broken |
106 | + # transaction. Get us a fresh one. |
107 | transaction.abort() |
108 | |
109 | # If we don't recognise the exception include a stack trace with |
110 | # the error. |
111 | error_message = failure.getErrorMessage() |
112 | - if failure.check( |
113 | + familiar_error = failure.check( |
114 | BuildSlaveFailure, CannotBuild, BuildBehaviorMismatch, |
115 | - CannotResumeHost, BuildDaemonError, CannotFetchFile): |
116 | - self.logger.info("Scanning %s failed with: %s" % ( |
117 | - self.builder_name, error_message)) |
118 | + CannotResumeHost, BuildDaemonError, CannotFetchFile) |
119 | + if familiar_error: |
120 | + self.logger.info( |
121 | + "Scanning %s failed with: %s", |
122 | + self.builder_name, error_message) |
123 | else: |
124 | - self.logger.info("Scanning %s failed with: %s\n%s" % ( |
125 | + self.logger.info( |
126 | + "Scanning %s failed with: %s\n%s", |
127 | self.builder_name, failure.getErrorMessage(), |
128 | - failure.getTraceback())) |
129 | + failure.getTraceback()) |
130 | |
131 | # Decide if we need to terminate the job or fail the |
132 | # builder. |
133 | try: |
134 | builder = get_builder(self.builder_name) |
135 | - builder.gotFailure() |
136 | - if builder.currentjob is not None: |
137 | - build_farm_job = builder.getCurrentBuildFarmJob() |
138 | - build_farm_job.gotFailure() |
139 | - self.logger.info( |
140 | - "builder %s failure count: %s, " |
141 | - "job '%s' failure count: %s" % ( |
142 | + transaction.commit() |
143 | + |
144 | + with DatabaseTransactionPolicy(read_only=False): |
145 | + builder.gotFailure() |
146 | + |
147 | + if builder.currentjob is None: |
148 | + self.logger.info( |
149 | + "Builder %s failed a probe, count: %s", |
150 | + self.builder_name, builder.failure_count) |
151 | + else: |
152 | + build_farm_job = builder.getCurrentBuildFarmJob() |
153 | + build_farm_job.gotFailure() |
154 | + self.logger.info( |
155 | + "builder %s failure count: %s, " |
156 | + "job '%s' failure count: %s", |
157 | self.builder_name, |
158 | builder.failure_count, |
159 | build_farm_job.title, |
160 | - build_farm_job.failure_count)) |
161 | - else: |
162 | - self.logger.info( |
163 | - "Builder %s failed a probe, count: %s" % ( |
164 | - self.builder_name, builder.failure_count)) |
165 | - assessFailureCounts(builder, failure.getErrorMessage()) |
166 | - transaction.commit() |
167 | + build_farm_job.failure_count) |
168 | + |
169 | + assessFailureCounts(builder, failure.getErrorMessage()) |
170 | + transaction.commit() |
171 | except: |
172 | # Catastrophic code failure! Not much we can do. |
173 | + transaction.abort() |
174 | self.logger.error( |
175 | "Miserable failure when trying to examine failure counts:\n", |
176 | exc_info=True) |
177 | - transaction.abort() |
178 | |
179 | def checkCancellation(self, builder): |
180 | """See if there is a pending cancellation request. |
181 | @@ -236,14 +248,9 @@ |
182 | """ |
183 | # We need to re-fetch the builder object on each cycle as the |
184 | # Storm store is invalidated over transaction boundaries. |
185 | - |
186 | self.builder = get_builder(self.builder_name) |
187 | |
188 | def status_updated(ignored): |
189 | - # Commit the changes done while possibly rescuing jobs, to |
190 | - # avoid holding table locks. |
191 | - transaction.commit() |
192 | - |
193 | # See if we think there's an active build on the builder. |
194 | buildqueue = self.builder.getBuildQueue() |
195 | |
196 | @@ -253,14 +260,10 @@ |
197 | return self.builder.updateBuild(buildqueue) |
198 | |
199 | def build_updated(ignored): |
200 | - # Commit changes done while updating the build, to avoid |
201 | - # holding table locks. |
202 | - transaction.commit() |
203 | - |
204 | # If the builder is in manual mode, don't dispatch anything. |
205 | if self.builder.manual: |
206 | self.logger.debug( |
207 | - '%s is in manual mode, not dispatching.' % |
208 | + '%s is in manual mode, not dispatching.', |
209 | self.builder.name) |
210 | return |
211 | |
212 | @@ -278,22 +281,33 @@ |
213 | job = self.builder.currentjob |
214 | if job is not None and not self.builder.builderok: |
215 | self.logger.info( |
216 | - "%s was made unavailable, resetting attached " |
217 | - "job" % self.builder.name) |
218 | - job.reset() |
219 | + "%s was made unavailable; resetting attached job.", |
220 | + self.builder.name) |
221 | transaction.commit() |
222 | + with DatabaseTransactionPolicy(read_only=False): |
223 | + job.reset() |
224 | + transaction.commit() |
225 | return |
226 | |
227 | # See if there is a job we can dispatch to the builder slave. |
228 | |
229 | + # XXX JeroenVermeulen 2011-10-11, bug=872112: The job's |
230 | + # failure count will be reset once the job has started |
231 | + # successfully. Because of intervening commits, you may see |
232 | + # a build with a nonzero failure count that's actually going |
233 | + # to succeed later (and have a failure count of zero). Or |
234 | + # it may fail yet end up with a lower failure count than you |
235 | + # saw earlier. |
236 | d = self.builder.findAndStartJob() |
237 | |
238 | def job_started(candidate): |
239 | if self.builder.currentjob is not None: |
240 | # After a successful dispatch we can reset the |
241 | # failure_count. |
242 | - self.builder.resetFailureCount() |
243 | transaction.commit() |
244 | + with DatabaseTransactionPolicy(read_only=False): |
245 | + self.builder.resetFailureCount() |
246 | + transaction.commit() |
247 | return self.builder.slave |
248 | else: |
249 | return None |
250 | @@ -372,6 +386,7 @@ |
251 | self.logger = self._setupLogger() |
252 | self.new_builders_scanner = NewBuildersScanner( |
253 | manager=self, clock=clock) |
254 | + self.transaction_policy = DatabaseTransactionPolicy(read_only=True) |
255 | |
256 | def _setupLogger(self): |
257 | """Set up a 'slave-scanner' logger that redirects to twisted. |
258 | @@ -390,16 +405,28 @@ |
259 | logger.setLevel(level) |
260 | return logger |
261 | |
262 | + def enterReadOnlyDatabasePolicy(self): |
263 | + """Set the database transaction policy to read-only. |
264 | + |
265 | + Any previously pending changes are committed first. |
266 | + """ |
267 | + transaction.commit() |
268 | + self.transaction_policy.__enter__() |
269 | + |
270 | + def exitReadOnlyDatabasePolicy(self, *args): |
271 | + """Reset database transaction policy to the default read-write.""" |
272 | + self.transaction_policy.__exit__(None, None, None) |
273 | + |
274 | def startService(self): |
275 | """Service entry point, called when the application starts.""" |
276 | + # Avoiding circular imports. |
277 | + from lp.buildmaster.interfaces.builder import IBuilderSet |
278 | + |
279 | + self.enterReadOnlyDatabasePolicy() |
280 | |
281 | # Get a list of builders and set up scanners on each one. |
282 | - |
283 | - # Avoiding circular imports. |
284 | - from lp.buildmaster.interfaces.builder import IBuilderSet |
285 | - builder_set = getUtility(IBuilderSet) |
286 | - builders = [builder.name for builder in builder_set] |
287 | - self.addScanForBuilders(builders) |
288 | + self.addScanForBuilders( |
289 | + [builder.name for builder in getUtility(IBuilderSet)]) |
290 | self.new_builders_scanner.scheduleScan() |
291 | |
292 | # Events will now fire in the SlaveScanner objects to scan each |
293 | @@ -420,6 +447,7 @@ |
294 | # stopped, so we can wait on them all at once here before |
295 | # exiting. |
296 | d = defer.DeferredList(deferreds, consumeErrors=True) |
297 | + d.addCallback(self.exitReadOnlyDatabasePolicy) |
298 | return d |
299 | |
300 | def addScanForBuilders(self, builders): |
301 | |
302 | === modified file 'lib/lp/buildmaster/model/builder.py' |
303 | --- lib/lp/buildmaster/model/builder.py 2011-11-23 12:02:59 +0000 |
304 | +++ lib/lp/buildmaster/model/builder.py 2011-12-08 16:19:26 +0000 |
305 | @@ -1,4 +1,4 @@ |
306 | -# Copyright 2009,2011 Canonical Ltd. This software is licensed under the |
307 | +# Copyright 2009-2011 Canonical Ltd. This software is licensed under the |
308 | # GNU Affero General Public License version 3 (see the file LICENSE). |
309 | |
310 | # pylint: disable-msg=E0611,W0212 |
311 | @@ -76,6 +76,7 @@ |
312 | specific_job_classes, |
313 | ) |
314 | from lp.registry.interfaces.person import validate_public_person |
315 | +from lp.services.database.transaction_policy import DatabaseTransactionPolicy |
316 | from lp.services.job.interfaces.job import JobStatus |
317 | from lp.services.job.model.job import Job |
318 | from lp.services.propertycache import ( |
319 | @@ -545,6 +546,8 @@ |
320 | |
321 | def setSlaveForTesting(self, proxy): |
322 | """See IBuilder.""" |
323 | + # XXX JeroenVermeulen 2011-11-09, bug=888010: Don't use this. |
324 | + # It's a trap. See bug for details. |
325 | self._testing_slave = proxy |
326 | del get_property_cache(self).slave |
327 | |
328 | @@ -673,10 +676,13 @@ |
329 | bytes_written = out_file.tell() |
330 | out_file.seek(0) |
331 | |
332 | - library_file = getUtility(ILibraryFileAliasSet).create( |
333 | - filename, bytes_written, out_file, |
334 | - contentType=filenameToContentType(filename), |
335 | - restricted=private) |
336 | + transaction.commit() |
337 | + with DatabaseTransactionPolicy(read_only=False): |
338 | + library_file = getUtility(ILibraryFileAliasSet).create( |
339 | + filename, bytes_written, out_file, |
340 | + contentType=filenameToContentType(filename), |
341 | + restricted=private) |
342 | + transaction.commit() |
343 | finally: |
344 | # Remove the temporary file. getFile() closes the file |
345 | # object. |
346 | @@ -714,7 +720,7 @@ |
347 | def acquireBuildCandidate(self): |
348 | """Acquire a build candidate in an atomic fashion. |
349 | |
350 | - When retrieiving a candidate we need to mark it as building |
351 | + When retrieving a candidate we need to mark it as building |
352 | immediately so that it is not dispatched by another builder in the |
353 | build manager. |
354 | |
355 | @@ -724,12 +730,15 @@ |
356 | can be in this code at the same time. |
357 | |
358 | If there's ever more than one build manager running at once, then |
359 | - this code will need some sort of mutex. |
360 | + this code will need some sort of mutex, or run in a single |
361 | + transaction. |
362 | """ |
363 | candidate = self._findBuildCandidate() |
364 | if candidate is not None: |
365 | - candidate.markAsBuilding(self) |
366 | transaction.commit() |
367 | + with DatabaseTransactionPolicy(read_only=False): |
368 | + candidate.markAsBuilding(self) |
369 | + transaction.commit() |
370 | return candidate |
371 | |
372 | def _findBuildCandidate(self): |
373 | |
374 | === modified file 'lib/lp/buildmaster/model/buildfarmjobbehavior.py' |
375 | --- lib/lp/buildmaster/model/buildfarmjobbehavior.py 2011-06-09 10:50:25 +0000 |
376 | +++ lib/lp/buildmaster/model/buildfarmjobbehavior.py 2011-12-08 16:19:26 +0000 |
377 | @@ -1,4 +1,4 @@ |
378 | -# Copyright 2009-2010 Canonical Ltd. This software is licensed under the |
379 | +# Copyright 2009-2011 Canonical Ltd. This software is licensed under the |
380 | # GNU Affero General Public License version 3 (see the file LICENSE). |
381 | |
382 | # pylint: disable-msg=E0211,E0213 |
383 | @@ -16,8 +16,8 @@ |
384 | import socket |
385 | import xmlrpclib |
386 | |
387 | +import transaction |
388 | from twisted.internet import defer |
389 | - |
390 | from zope.component import getUtility |
391 | from zope.interface import implements |
392 | from zope.security.proxy import removeSecurityProxy |
393 | @@ -32,6 +32,7 @@ |
394 | IBuildFarmJobBehavior, |
395 | ) |
396 | from lp.services import encoding |
397 | +from lp.services.database.transaction_policy import DatabaseTransactionPolicy |
398 | from lp.services.job.interfaces.job import JobStatus |
399 | |
400 | |
401 | @@ -70,6 +71,25 @@ |
402 | if slave_build_cookie != expected_cookie: |
403 | raise CorruptBuildCookie("Invalid slave build cookie.") |
404 | |
405 | + def _getBuilderStatusHandler(self, status_text, logger): |
406 | + """Look up the handler method for a given builder status. |
407 | + |
408 | + If status is not a known one, logs an error and returns None. |
409 | + """ |
410 | + builder_status_handlers = { |
411 | + 'BuilderStatus.IDLE': self.updateBuild_IDLE, |
412 | + 'BuilderStatus.BUILDING': self.updateBuild_BUILDING, |
413 | + 'BuilderStatus.ABORTING': self.updateBuild_ABORTING, |
414 | + 'BuilderStatus.ABORTED': self.updateBuild_ABORTED, |
415 | + 'BuilderStatus.WAITING': self.updateBuild_WAITING, |
416 | + } |
417 | + handler = builder_status_handlers.get(status_text) |
418 | + if handler is None: |
419 | + logger.critical( |
420 | + "Builder on %s returned unknown status %s; failing it.", |
421 | + self._builder.url, status_text) |
422 | + return handler |
423 | + |
424 | def updateBuild(self, queueItem): |
425 | """See `IBuildFarmJobBehavior`.""" |
426 | logger = logging.getLogger('slave-scanner') |
427 | @@ -77,6 +97,7 @@ |
428 | d = self._builder.slaveStatus() |
429 | |
430 | def got_failure(failure): |
431 | + transaction.abort() |
432 | failure.trap(xmlrpclib.Fault, socket.error) |
433 | info = failure.value |
434 | info = ("Could not contact the builder %s, caught a (%s)" |
435 | @@ -84,27 +105,22 @@ |
436 | raise BuildSlaveFailure(info) |
437 | |
438 | def got_status(slave_status): |
439 | - builder_status_handlers = { |
440 | - 'BuilderStatus.IDLE': self.updateBuild_IDLE, |
441 | - 'BuilderStatus.BUILDING': self.updateBuild_BUILDING, |
442 | - 'BuilderStatus.ABORTING': self.updateBuild_ABORTING, |
443 | - 'BuilderStatus.ABORTED': self.updateBuild_ABORTED, |
444 | - 'BuilderStatus.WAITING': self.updateBuild_WAITING, |
445 | - } |
446 | - |
447 | builder_status = slave_status['builder_status'] |
448 | - if builder_status not in builder_status_handlers: |
449 | - logger.critical( |
450 | - "Builder on %s returned unknown status %s, failing it" |
451 | - % (self._builder.url, builder_status)) |
452 | - self._builder.failBuilder( |
453 | + status_handler = self._getBuilderStatusHandler( |
454 | + builder_status, logger) |
455 | + if status_handler is None: |
456 | + error = ( |
457 | "Unknown status code (%s) returned from status() probe." |
458 | % builder_status) |
459 | - # XXX: This will leave the build and job in a bad state, but |
460 | - # should never be possible, since our builder statuses are |
461 | - # known. |
462 | - queueItem._builder = None |
463 | - queueItem.setDateStarted(None) |
464 | + transaction.commit() |
465 | + with DatabaseTransactionPolicy(read_only=False): |
466 | + self._builder.failBuilder(error) |
467 | + # XXX: This will leave the build and job in a bad |
468 | + # state, but should never be possible since our |
469 | + # builder statuses are known. |
470 | + queueItem._builder = None |
471 | + queueItem.setDateStarted(None) |
472 | + transaction.commit() |
473 | return |
474 | |
475 | # Since logtail is a xmlrpclib.Binary container and it is |
476 | @@ -114,9 +130,8 @@ |
477 | # will simply remove the proxy. |
478 | logtail = removeSecurityProxy(slave_status.get('logtail')) |
479 | |
480 | - method = builder_status_handlers[builder_status] |
481 | return defer.maybeDeferred( |
482 | - method, queueItem, slave_status, logtail, logger) |
483 | + status_handler, queueItem, slave_status, logtail, logger) |
484 | |
485 | d.addErrback(got_failure) |
486 | d.addCallback(got_status) |
487 | @@ -128,22 +143,32 @@ |
488 | Log this and reset the record. |
489 | """ |
490 | logger.warn( |
491 | - "Builder %s forgot about buildqueue %d -- resetting buildqueue " |
492 | - "record" % (queueItem.builder.url, queueItem.id)) |
493 | - queueItem.reset() |
494 | + "Builder %s forgot about buildqueue %d -- " |
495 | + "resetting buildqueue record.", |
496 | + queueItem.builder.url, queueItem.id) |
497 | + transaction.commit() |
498 | + with DatabaseTransactionPolicy(read_only=False): |
499 | + queueItem.reset() |
500 | + transaction.commit() |
501 | |
502 | def updateBuild_BUILDING(self, queueItem, slave_status, logtail, logger): |
503 | """Build still building, collect the logtail""" |
504 | - if queueItem.job.status != JobStatus.RUNNING: |
505 | - queueItem.job.start() |
506 | - queueItem.logtail = encoding.guess(str(logtail)) |
507 | + transaction.commit() |
508 | + with DatabaseTransactionPolicy(read_only=False): |
509 | + if queueItem.job.status != JobStatus.RUNNING: |
510 | + queueItem.job.start() |
511 | + queueItem.logtail = encoding.guess(str(logtail)) |
512 | + transaction.commit() |
513 | |
514 | def updateBuild_ABORTING(self, queueItem, slave_status, logtail, logger): |
515 | """Build was ABORTED. |
516 | |
517 | Master-side should wait until the slave finish the process correctly. |
518 | """ |
519 | - queueItem.logtail = "Waiting for slave process to be terminated" |
520 | + transaction.commit() |
521 | + with DatabaseTransactionPolicy(read_only=False): |
522 | + queueItem.logtail = "Waiting for slave process to be terminated" |
523 | + transaction.commit() |
524 | |
525 | def updateBuild_ABORTED(self, queueItem, slave_status, logtail, logger): |
526 | """ABORTING process has successfully terminated. |
527 | @@ -151,11 +176,16 @@ |
528 | Clean the builder for another jobs. |
529 | """ |
530 | d = queueItem.builder.cleanSlave() |
531 | + |
532 | def got_cleaned(ignored): |
533 | - queueItem.builder = None |
534 | - if queueItem.job.status != JobStatus.FAILED: |
535 | - queueItem.job.fail() |
536 | - queueItem.specific_job.jobAborted() |
537 | + transaction.commit() |
538 | + with DatabaseTransactionPolicy(read_only=False): |
539 | + queueItem.builder = None |
540 | + if queueItem.job.status != JobStatus.FAILED: |
541 | + queueItem.job.fail() |
542 | + queueItem.specific_job.jobAborted() |
543 | + transaction.commit() |
544 | + |
545 | return d.addCallback(got_cleaned) |
546 | |
547 | def extractBuildStatus(self, slave_status): |
548 | |
549 | === modified file 'lib/lp/buildmaster/model/packagebuild.py' |
550 | --- lib/lp/buildmaster/model/packagebuild.py 2011-08-31 19:41:51 +0000 |
551 | +++ lib/lp/buildmaster/model/packagebuild.py 2011-12-08 16:19:26 +0000 |
552 | @@ -1,4 +1,4 @@ |
553 | -# Copyright 2010 Canonical Ltd. This software is licensed under the |
554 | +# Copyright 2010-2011 Canonical Ltd. This software is licensed under the |
555 | # GNU Affero General Public License version 3 (see the file LICENSE). |
556 | |
557 | __metaclass__ = type |
558 | @@ -9,11 +9,11 @@ |
559 | ] |
560 | |
561 | |
562 | +from cStringIO import StringIO |
563 | import datetime |
564 | import logging |
565 | import os.path |
566 | |
567 | -from cStringIO import StringIO |
568 | from lazr.delegates import delegates |
569 | import pytz |
570 | from storm.expr import Desc |
571 | @@ -24,6 +24,7 @@ |
572 | Storm, |
573 | Unicode, |
574 | ) |
575 | +import transaction |
576 | from zope.component import getUtility |
577 | from zope.interface import ( |
578 | classProvides, |
579 | @@ -43,8 +44,8 @@ |
580 | MAIN_STORE, |
581 | ) |
582 | from lp.buildmaster.enums import ( |
583 | + BuildFarmJobType, |
584 | BuildStatus, |
585 | - BuildFarmJobType, |
586 | ) |
587 | from lp.buildmaster.interfaces.buildfarmjob import IBuildFarmJobSource |
588 | from lp.buildmaster.interfaces.packagebuild import ( |
589 | @@ -57,9 +58,8 @@ |
590 | BuildFarmJobDerived, |
591 | ) |
592 | from lp.buildmaster.model.buildqueue import BuildQueue |
593 | -from lp.registry.interfaces.pocket import ( |
594 | - PackagePublishingPocket, |
595 | - ) |
596 | +from lp.registry.interfaces.pocket import PackagePublishingPocket |
597 | +from lp.services.database.transaction_policy import DatabaseTransactionPolicy |
598 | from lp.soyuz.adapters.archivedependencies import ( |
599 | default_component_dependency_name, |
600 | ) |
601 | @@ -179,19 +179,24 @@ |
602 | def storeBuildInfo(build, librarian, slave_status): |
603 | """See `IPackageBuild`.""" |
604 | def got_log(lfa_id): |
605 | + dependencies = slave_status.get('dependencies') |
606 | + if dependencies is not None: |
607 | + dependencies = unicode(dependencies) |
608 | + |
609 | # log, builder and date_finished are read-only, so we must |
610 | # currently remove the security proxy to set them. |
611 | naked_build = removeSecurityProxy(build) |
612 | - naked_build.log = lfa_id |
613 | - naked_build.builder = build.buildqueue_record.builder |
614 | - # XXX cprov 20060615 bug=120584: Currently buildduration includes |
615 | - # the scanner latency, it should really be asking the slave for |
616 | - # the duration spent building locally. |
617 | - naked_build.date_finished = datetime.datetime.now(pytz.UTC) |
618 | - if slave_status.get('dependencies') is not None: |
619 | - build.dependencies = unicode(slave_status.get('dependencies')) |
620 | - else: |
621 | - build.dependencies = None |
622 | + |
623 | + transaction.commit() |
624 | + with DatabaseTransactionPolicy(read_only=False): |
625 | + naked_build.log = lfa_id |
626 | + naked_build.builder = build.buildqueue_record.builder |
627 | + # XXX cprov 20060615 bug=120584: Currently buildduration |
628 | + # includes the scanner latency. It should really be asking |
629 | + # the slave for the duration spent building locally. |
630 | + naked_build.date_finished = datetime.datetime.now(pytz.UTC) |
631 | + build.dependencies = dependencies |
632 | + transaction.commit() |
633 | |
634 | d = build.getLogFromSlave(build) |
635 | return d.addCallback(got_log) |
636 | @@ -292,22 +297,41 @@ |
637 | |
638 | def handleStatus(self, status, librarian, slave_status): |
639 | """See `IPackageBuild`.""" |
640 | + # Avoid circular imports. |
641 | from lp.buildmaster.manager import BUILDD_MANAGER_LOG_NAME |
642 | + |
643 | logger = logging.getLogger(BUILDD_MANAGER_LOG_NAME) |
644 | send_notification = status in self.ALLOWED_STATUS_NOTIFICATIONS |
645 | method = getattr(self, '_handleStatus_' + status, None) |
646 | if method is None: |
647 | - logger.critical("Unknown BuildStatus '%s' for builder '%s'" |
648 | - % (status, self.buildqueue_record.builder.url)) |
649 | - return |
650 | + logger.critical( |
651 | + "Unknown BuildStatus '%s' for builder '%s'", |
652 | + status, self.buildqueue_record.builder.url) |
653 | + return None |
654 | + |
655 | d = method(librarian, slave_status, logger, send_notification) |
656 | return d |
657 | |
658 | + def _destroy_buildqueue_record(self, unused_arg): |
659 | + """Destroy this build's `BuildQueue` record.""" |
660 | + transaction.commit() |
661 | + with DatabaseTransactionPolicy(read_only=False): |
662 | + self.buildqueue_record.destroySelf() |
663 | + transaction.commit() |
664 | + |
665 | def _release_builder_and_remove_queue_item(self): |
666 | # Release the builder for another job. |
667 | d = self.buildqueue_record.builder.cleanSlave() |
668 | # Remove BuildQueue record. |
669 | - return d.addCallback(lambda x: self.buildqueue_record.destroySelf()) |
670 | + return d.addCallback(self._destroy_buildqueue_record) |
671 | + |
672 | + def _notify_if_appropriate(self, appropriate=True, extra_info=None): |
673 | + """If `appropriate`, call `self.notify` in a write transaction.""" |
674 | + if appropriate: |
675 | + transaction.commit() |
676 | + with DatabaseTransactionPolicy(read_only=False): |
677 | + self.notify(extra_info=extra_info) |
678 | + transaction.commit() |
679 | |
680 | def _handleStatus_OK(self, librarian, slave_status, logger, |
681 | send_notification): |
682 | @@ -323,16 +347,19 @@ |
683 | self.buildqueue_record.specific_job.build.title, |
684 | self.buildqueue_record.builder.name)) |
685 | |
686 | - # If this is a binary package build, discard it if its source is |
687 | - # no longer published. |
688 | + # If this is a binary package build for a source that is no |
689 | + # longer published, discard it. |
690 | if self.build_farm_job_type == BuildFarmJobType.PACKAGEBUILD: |
691 | build = self.buildqueue_record.specific_job.build |
692 | if not build.current_source_publication: |
693 | - build.status = BuildStatus.SUPERSEDED |
694 | + transaction.commit() |
695 | + with DatabaseTransactionPolicy(read_only=False): |
696 | + build.status = BuildStatus.SUPERSEDED |
697 | + transaction.commit() |
698 | return self._release_builder_and_remove_queue_item() |
699 | |
700 | - # Explode before collect a binary that is denied in this |
701 | - # distroseries/pocket |
702 | + # Explode rather than collect a binary that is denied in this |
703 | + # distroseries/pocket. |
704 | if not self.archive.allowUpdatesToReleasePocket(): |
705 | assert self.distro_series.canUploadToPocket(self.pocket), ( |
706 | "%s (%s) can not be built for pocket %s: illegal status" |
707 | @@ -377,18 +404,26 @@ |
708 | # files from the slave. |
709 | if successful_copy_from_slave: |
710 | logger.info( |
711 | - "Gathered %s %d completely. Moving %s to uploader queue." |
712 | - % (self.__class__.__name__, self.id, upload_leaf)) |
713 | + "Gathered %s %d completely. " |
714 | + "Moving %s to uploader queue.", |
715 | + self.__class__.__name__, self.id, upload_leaf) |
716 | target_dir = os.path.join(root, "incoming") |
717 | - self.status = BuildStatus.UPLOADING |
718 | + resulting_status = BuildStatus.UPLOADING |
719 | else: |
720 | logger.warning( |
721 | - "Copy from slave for build %s was unsuccessful.", self.id) |
722 | - self.status = BuildStatus.FAILEDTOUPLOAD |
723 | - if send_notification: |
724 | - self.notify( |
725 | - extra_info='Copy from slave was unsuccessful.') |
726 | + "Copy from slave for build %s was unsuccessful.", |
727 | + self.id) |
728 | target_dir = os.path.join(root, "failed") |
729 | + resulting_status = BuildStatus.FAILEDTOUPLOAD |
730 | + |
731 | + transaction.commit() |
732 | + with DatabaseTransactionPolicy(read_only=False): |
733 | + self.status = resulting_status |
734 | + transaction.commit() |
735 | + |
736 | + if not successful_copy_from_slave: |
737 | + self._notify_if_appropriate( |
738 | + send_notification, "Copy from slave was unsuccessful.") |
739 | |
740 | if not os.path.exists(target_dir): |
741 | os.mkdir(target_dir) |
742 | @@ -396,10 +431,6 @@ |
743 | # Release the builder for another job. |
744 | d = self._release_builder_and_remove_queue_item() |
745 | |
746 | - # Commit so there are no race conditions with archiveuploader |
747 | - # about self.status. |
748 | - Store.of(self).commit() |
749 | - |
750 | # Move the directory used to grab the binaries into |
751 | # the incoming directory so the upload processor never |
752 | # sees half-finished uploads. |
753 | @@ -423,14 +454,15 @@ |
754 | set the job status as FAILEDTOBUILD, store available info and |
755 | remove Buildqueue entry. |
756 | """ |
757 | - self.status = BuildStatus.FAILEDTOBUILD |
758 | + transaction.commit() |
759 | + with DatabaseTransactionPolicy(read_only=False): |
760 | + self.status = BuildStatus.FAILEDTOBUILD |
761 | + transaction.commit() |
762 | |
763 | def build_info_stored(ignored): |
764 | - if send_notification: |
765 | - self.notify() |
766 | + self._notify_if_appropriate(send_notification) |
767 | d = self.buildqueue_record.builder.cleanSlave() |
768 | - return d.addCallback( |
769 | - lambda x: self.buildqueue_record.destroySelf()) |
770 | + return d.addCallback(self._destroy_buildqueue_record) |
771 | |
772 | d = self.storeBuildInfo(self, librarian, slave_status) |
773 | return d.addCallback(build_info_stored) |
774 | @@ -448,11 +480,9 @@ |
775 | def build_info_stored(ignored): |
776 | logger.critical("***** %s is MANUALDEPWAIT *****" |
777 | % self.buildqueue_record.builder.name) |
778 | - if send_notification: |
779 | - self.notify() |
780 | + self._notify_if_appropriate(send_notification) |
781 | d = self.buildqueue_record.builder.cleanSlave() |
782 | - return d.addCallback( |
783 | - lambda x: self.buildqueue_record.destroySelf()) |
784 | + return d.addCallback(self._destroy_buildqueue_record) |
785 | |
786 | d = self.storeBuildInfo(self, librarian, slave_status) |
787 | return d.addCallback(build_info_stored) |
788 | @@ -468,17 +498,24 @@ |
789 | self.status = BuildStatus.CHROOTWAIT |
790 | |
791 | def build_info_stored(ignored): |
792 | - logger.critical("***** %s is CHROOTWAIT *****" % |
793 | - self.buildqueue_record.builder.name) |
794 | - if send_notification: |
795 | - self.notify() |
796 | + logger.critical( |
797 | + "***** %s is CHROOTWAIT *****", |
798 | + self.buildqueue_record.builder.name) |
799 | + |
800 | + self._notify_if_appropriate(send_notification) |
801 | d = self.buildqueue_record.builder.cleanSlave() |
802 | - return d.addCallback( |
803 | - lambda x: self.buildqueue_record.destroySelf()) |
804 | + return d.addCallback(self._destroy_buildqueue_record) |
805 | |
806 | d = self.storeBuildInfo(self, librarian, slave_status) |
807 | return d.addCallback(build_info_stored) |
808 | |
809 | + def _reset_buildqueue_record(self, ignored_arg=None): |
810 | + """Reset the `BuildQueue` record, in a write transaction.""" |
811 | + transaction.commit() |
812 | + with DatabaseTransactionPolicy(read_only=False): |
813 | + self.buildqueue_record.reset() |
814 | + transaction.commit() |
815 | + |
816 | def _handleStatus_BUILDERFAIL(self, librarian, slave_status, logger, |
817 | send_notification): |
818 | """Handle builder failures. |
819 | @@ -492,11 +529,8 @@ |
820 | self.buildqueue_record.builder.failBuilder( |
821 | "Builder returned BUILDERFAIL when asked for its status") |
822 | |
823 | - def build_info_stored(ignored): |
824 | - # simply reset job |
825 | - self.buildqueue_record.reset() |
826 | d = self.storeBuildInfo(self, librarian, slave_status) |
827 | - return d.addCallback(build_info_stored) |
828 | + return d.addCallback(self._reset_buildqueue_record) |
829 | |
830 | def _handleStatus_GIVENBACK(self, librarian, slave_status, logger, |
831 | send_notification): |
832 | @@ -516,7 +550,7 @@ |
833 | # the next Paris Summit, infinity has some ideas about how |
834 | # to use this content. For now we just ensure it's stored. |
835 | d = self.buildqueue_record.builder.cleanSlave() |
836 | - self.buildqueue_record.reset() |
837 | + self._reset_buildqueue_record() |
838 | return d |
839 | |
840 | d = self.storeBuildInfo(self, librarian, slave_status) |
841 | |
842 | === modified file 'lib/lp/buildmaster/tests/test_builder.py' |
843 | --- lib/lp/buildmaster/tests/test_builder.py 2011-11-20 23:37:23 +0000 |
844 | +++ lib/lp/buildmaster/tests/test_builder.py 2011-12-08 16:19:26 +0000 |
845 | @@ -155,7 +155,7 @@ |
846 | d = lostbuilding_builder.updateStatus(BufferLogger()) |
847 | def check_slave_status(failure): |
848 | self.assertIn('abort', slave.call_log) |
849 | - # 'Fault' comes from the LostBuildingBrokenSlave, this is |
850 | + # 'Fault' comes from the LostBuildingBrokenSlave. This is |
851 | # just testing that the value is passed through. |
852 | self.assertIsInstance(failure.value, xmlrpclib.Fault) |
853 | return d.addBoth(check_slave_status) |
854 | |
855 | === modified file 'lib/lp/buildmaster/tests/test_manager.py' |
856 | --- lib/lp/buildmaster/tests/test_manager.py 2011-11-20 23:37:23 +0000 |
857 | +++ lib/lp/buildmaster/tests/test_manager.py 2011-12-08 16:19:26 +0000 |
858 | @@ -1,40 +1,36 @@ |
859 | -# Copyright 2009-2010 Canonical Ltd. This software is licensed under the |
860 | +# Copyright 2009-2011 Canonical Ltd. This software is licensed under the |
861 | # GNU Affero General Public License version 3 (see the file LICENSE). |
862 | |
863 | """Tests for the renovated slave scanner aka BuilddManager.""" |
864 | |
865 | +from collections import namedtuple |
866 | import os |
867 | import signal |
868 | import time |
869 | import xmlrpclib |
870 | |
871 | +from lpbuildd.tests import BuilddSlaveTestSetup |
872 | from testtools.deferredruntest import ( |
873 | assert_fails_with, |
874 | AsynchronousDeferredRunTest, |
875 | ) |
876 | - |
877 | import transaction |
878 | - |
879 | from twisted.internet import ( |
880 | defer, |
881 | reactor, |
882 | task, |
883 | ) |
884 | -from twisted.internet.task import ( |
885 | - deferLater, |
886 | - ) |
887 | +from twisted.internet.task import deferLater |
888 | from twisted.python.failure import Failure |
889 | from zope.component import getUtility |
890 | from zope.security.proxy import removeSecurityProxy |
891 | |
892 | -from lpbuildd.tests import BuilddSlaveTestSetup |
893 | - |
894 | from canonical.config import config |
895 | +from canonical.database.constants import UTC_NOW |
896 | from canonical.launchpad.ftests import ( |
897 | ANONYMOUS, |
898 | login, |
899 | ) |
900 | -from lp.services.log.logger import BufferLogger |
901 | from canonical.testing.layers import ( |
902 | LaunchpadScriptLayer, |
903 | LaunchpadZopelessLayer, |
904 | @@ -50,14 +46,18 @@ |
905 | SlaveScanner, |
906 | ) |
907 | from lp.buildmaster.model.builder import Builder |
908 | +from lp.buildmaster.model.packagebuild import PackageBuild |
909 | from lp.buildmaster.tests.harness import BuilddManagerTestSetup |
910 | from lp.buildmaster.tests.mock_slaves import ( |
911 | BrokenSlave, |
912 | BuildingSlave, |
913 | make_publisher, |
914 | OkSlave, |
915 | + WaitingSlave, |
916 | ) |
917 | from lp.registry.interfaces.distribution import IDistributionSet |
918 | +from lp.services.database.transaction_policy import DatabaseTransactionPolicy |
919 | +from lp.services.log.logger import BufferLogger |
920 | from lp.soyuz.interfaces.binarypackagebuild import IBinaryPackageBuildSet |
921 | from lp.testing import ( |
922 | TestCase, |
923 | @@ -65,10 +65,13 @@ |
924 | ) |
925 | from lp.testing.factory import LaunchpadObjectFactory |
926 | from lp.testing.fakemethod import FakeMethod |
927 | -from lp.testing.sampledata import BOB_THE_BUILDER_NAME |
928 | - |
929 | - |
930 | -class TestSlaveScannerScan(TestCase): |
931 | +from lp.testing.sampledata import ( |
932 | + BOB_THE_BUILDER_NAME, |
933 | + FROG_THE_BUILDER_NAME, |
934 | + ) |
935 | + |
936 | + |
937 | +class TestSlaveScannerScan(TestCaseWithFactory): |
938 | """Tests `SlaveScanner.scan` method. |
939 | |
940 | This method uses the old framework for scanning and dispatching builds. |
941 | @@ -83,6 +86,8 @@ |
942 | 'bob' builder. |
943 | """ |
944 | super(TestSlaveScannerScan, self).setUp() |
945 | + self.read_only = DatabaseTransactionPolicy(read_only=True) |
946 | + |
947 | # Creating the required chroots needed for dispatching. |
948 | test_publisher = make_publisher() |
949 | ubuntu = getUtility(IDistributionSet).getByName('ubuntu') |
950 | @@ -90,6 +95,15 @@ |
951 | test_publisher.setUpDefaultDistroSeries(hoary) |
952 | test_publisher.addFakeChroots() |
953 | |
954 | + def _enterReadOnly(self): |
955 | + """Go into read-only transaction policy.""" |
956 | + self.read_only.__enter__() |
957 | + self.addCleanup(self._exitReadOnly) |
958 | + |
959 | + def _exitReadOnly(self): |
960 | + """Leave read-only transaction policy.""" |
961 | + self.read_only.__exit__(None, None, None) |
962 | + |
963 | def _resetBuilder(self, builder): |
964 | """Reset the given builder and its job.""" |
965 | |
966 | @@ -100,6 +114,23 @@ |
967 | |
968 | transaction.commit() |
969 | |
970 | + def getFreshBuilder(self, slave=None, name=BOB_THE_BUILDER_NAME, |
971 | + failure_count=0): |
972 | + """Return a builder. |
973 | + |
974 | + The builder is taken from sample data, but reset to a usable state. |
975 | + Be careful: this is not a proper factory method. Identical calls |
976 | + return (and reset) the same builder. Don't rely on that though; |
977 | + maybe someday we'll have a proper factory here. |
978 | + """ |
979 | + if slave is None: |
980 | + slave = OkSlave() |
981 | + builder = getUtility(IBuilderSet)[name] |
982 | + self._resetBuilder(builder) |
983 | + builder.setSlaveForTesting(slave) |
984 | + builder.failure_count = failure_count |
985 | + return builder |
986 | + |
987 | def assertBuildingJob(self, job, builder, logtail=None): |
988 | """Assert the given job is building on the given builder.""" |
989 | from lp.services.job.interfaces.job import JobStatus |
990 | @@ -114,14 +145,14 @@ |
991 | self.assertEqual(build.status, BuildStatus.BUILDING) |
992 | self.assertEqual(job.logtail, logtail) |
993 | |
994 | - def _getScanner(self, builder_name=None): |
995 | + def _getScanner(self, builder_name=None, clock=None): |
996 | """Instantiate a SlaveScanner object. |
997 | |
998 | Replace its default logging handler by a testing version. |
999 | """ |
1000 | if builder_name is None: |
1001 | builder_name = BOB_THE_BUILDER_NAME |
1002 | - scanner = SlaveScanner(builder_name, BufferLogger()) |
1003 | + scanner = SlaveScanner(builder_name, BufferLogger(), clock=clock) |
1004 | scanner.logger.name = 'slave-scanner' |
1005 | |
1006 | return scanner |
1007 | @@ -137,17 +168,15 @@ |
1008 | def testScanDispatchForResetBuilder(self): |
1009 | # A job gets dispatched to the sampledata builder after it's reset. |
1010 | |
1011 | - # Reset sampledata builder. |
1012 | - builder = getUtility(IBuilderSet)[BOB_THE_BUILDER_NAME] |
1013 | - self._resetBuilder(builder) |
1014 | - builder.setSlaveForTesting(OkSlave()) |
1015 | - # Set this to 1 here so that _checkDispatch can make sure it's |
1016 | - # reset to 0 after a successful dispatch. |
1017 | - builder.failure_count = 1 |
1018 | + # Obtain a builder. Initialize failure count to 1 so that |
1019 | + # _checkDispatch can make sure that a successful dispatch resets |
1020 | + # the count to 0. |
1021 | + builder = self.getFreshBuilder(failure_count=1) |
1022 | |
1023 | # Run 'scan' and check its result. |
1024 | self.layer.txn.commit() |
1025 | self.layer.switchDbUser(config.builddmaster.dbuser) |
1026 | + self._enterReadOnly() |
1027 | scanner = self._getScanner() |
1028 | d = defer.maybeDeferred(scanner.scan) |
1029 | d.addCallback(self._checkDispatch, builder) |
1030 | @@ -160,20 +189,18 @@ |
1031 | to the asynchonous dispatcher and the builder remained active |
1032 | and IDLE. |
1033 | """ |
1034 | - self.assertTrue(slave is None, "Unexpected slave.") |
1035 | + self.assertIs(None, slave, "Unexpected slave.") |
1036 | |
1037 | builder = getUtility(IBuilderSet).get(builder.id) |
1038 | self.assertTrue(builder.builderok) |
1039 | - self.assertTrue(builder.currentjob is None) |
1040 | + self.assertIs(None, builder.currentjob) |
1041 | |
1042 | def testNoDispatchForMissingChroots(self): |
1043 | # When a required chroot is not present the `scan` method |
1044 | # should not return any `RecordingSlaves` to be processed |
1045 | # and the builder used should remain active and IDLE. |
1046 | |
1047 | - # Reset sampledata builder. |
1048 | - builder = getUtility(IBuilderSet)[BOB_THE_BUILDER_NAME] |
1049 | - self._resetBuilder(builder) |
1050 | + builder = self.getFreshBuilder() |
1051 | |
1052 | # Remove hoary/i386 chroot. |
1053 | login('foo.bar@canonical.com') |
1054 | @@ -186,6 +213,7 @@ |
1055 | |
1056 | # Run 'scan' and check its result. |
1057 | self.layer.switchDbUser(config.builddmaster.dbuser) |
1058 | + self._enterReadOnly() |
1059 | scanner = self._getScanner() |
1060 | d = defer.maybeDeferred(scanner.singleCycle) |
1061 | d.addCallback(self._checkNoDispatch, builder) |
1062 | @@ -227,6 +255,7 @@ |
1063 | |
1064 | # Run 'scan' and check its result. |
1065 | self.layer.switchDbUser(config.builddmaster.dbuser) |
1066 | + self._enterReadOnly() |
1067 | scanner = self._getScanner() |
1068 | d = defer.maybeDeferred(scanner.scan) |
1069 | d.addCallback(self._checkJobRescued, builder, job) |
1070 | @@ -262,25 +291,27 @@ |
1071 | |
1072 | # Run 'scan' and check its result. |
1073 | self.layer.switchDbUser(config.builddmaster.dbuser) |
1074 | + self._enterReadOnly() |
1075 | scanner = self._getScanner() |
1076 | d = defer.maybeDeferred(scanner.scan) |
1077 | d.addCallback(self._checkJobUpdated, builder, job) |
1078 | return d |
1079 | |
1080 | def test_scan_with_nothing_to_dispatch(self): |
1081 | - factory = LaunchpadObjectFactory() |
1082 | - builder = factory.makeBuilder() |
1083 | + builder = self.factory.makeBuilder() |
1084 | builder.setSlaveForTesting(OkSlave()) |
1085 | + transaction.commit() |
1086 | + self._enterReadOnly() |
1087 | scanner = self._getScanner(builder_name=builder.name) |
1088 | d = scanner.scan() |
1089 | return d.addCallback(self._checkNoDispatch, builder) |
1090 | |
1091 | def test_scan_with_manual_builder(self): |
1092 | # Reset sampledata builder. |
1093 | - builder = getUtility(IBuilderSet)[BOB_THE_BUILDER_NAME] |
1094 | - self._resetBuilder(builder) |
1095 | - builder.setSlaveForTesting(OkSlave()) |
1096 | + builder = self.getFreshBuilder() |
1097 | builder.manual = True |
1098 | + transaction.commit() |
1099 | + self._enterReadOnly() |
1100 | scanner = self._getScanner() |
1101 | d = scanner.scan() |
1102 | d.addCallback(self._checkNoDispatch, builder) |
1103 | @@ -288,10 +319,10 @@ |
1104 | |
1105 | def test_scan_with_not_ok_builder(self): |
1106 | # Reset sampledata builder. |
1107 | - builder = getUtility(IBuilderSet)[BOB_THE_BUILDER_NAME] |
1108 | - self._resetBuilder(builder) |
1109 | - builder.setSlaveForTesting(OkSlave()) |
1110 | + builder = self.getFreshBuilder() |
1111 | builder.builderok = False |
1112 | + transaction.commit() |
1113 | + self._enterReadOnly() |
1114 | scanner = self._getScanner() |
1115 | d = scanner.scan() |
1116 | # Because the builder is not ok, we can't use _checkNoDispatch. |
1117 | @@ -300,25 +331,27 @@ |
1118 | return d |
1119 | |
1120 | def test_scan_of_broken_slave(self): |
1121 | - builder = getUtility(IBuilderSet)[BOB_THE_BUILDER_NAME] |
1122 | - self._resetBuilder(builder) |
1123 | - builder.setSlaveForTesting(BrokenSlave()) |
1124 | - builder.failure_count = 0 |
1125 | + builder = self.getFreshBuilder(slave=BrokenSlave()) |
1126 | + transaction.commit() |
1127 | + self._enterReadOnly() |
1128 | scanner = self._getScanner(builder_name=builder.name) |
1129 | d = scanner.scan() |
1130 | return assert_fails_with(d, xmlrpclib.Fault) |
1131 | |
1132 | def _assertFailureCounting(self, builder_count, job_count, |
1133 | expected_builder_count, expected_job_count): |
1134 | + # Avoid circular imports. |
1135 | + from lp.buildmaster import manager as manager_module |
1136 | + |
1137 | # If scan() fails with an exception, failure_counts should be |
1138 | # incremented. What we do with the results of the failure |
1139 | # counts is tested below separately, this test just makes sure that |
1140 | # scan() is setting the counts. |
1141 | def failing_scan(): |
1142 | return defer.fail(Exception("fake exception")) |
1143 | + |
1144 | scanner = self._getScanner() |
1145 | scanner.scan = failing_scan |
1146 | - from lp.buildmaster import manager as manager_module |
1147 | self.patch(manager_module, 'assessFailureCounts', FakeMethod()) |
1148 | builder = getUtility(IBuilderSet)[scanner.builder_name] |
1149 | |
1150 | @@ -466,6 +499,60 @@ |
1151 | d.addCallback(check_cancelled, builder, buildqueue) |
1152 | return d |
1153 | |
1154 | + def makeFakeFailure(self): |
1155 | + """Produce a fake failure for use with SlaveScanner._scanFailed.""" |
1156 | + FakeFailure = namedtuple('FakeFailure', ['getErrorMessage', 'check']) |
1157 | + return FakeFailure( |
1158 | + FakeMethod(self.factory.getUniqueString()), |
1159 | + FakeMethod(True)) |
1160 | + |
1161 | + def test_interleaved_success_and_failure_do_not_interfere(self): |
1162 | + # It's possible for one builder to fail while another continues |
1163 | + # to function properly. When that happens, the failed builder |
1164 | + # may cause database changes to be rolled back. But that does |
1165 | + # not affect the functioning builder. |
1166 | + clock = task.Clock() |
1167 | + |
1168 | + broken_builder = self.getFreshBuilder( |
1169 | + slave=BrokenSlave(), name=BOB_THE_BUILDER_NAME) |
1170 | + broken_scanner = self._getScanner(builder_name=broken_builder.name) |
1171 | + good_builder = self.getFreshBuilder( |
1172 | + slave=WaitingSlave(), name=FROG_THE_BUILDER_NAME) |
1173 | + good_build = self.factory.makeBinaryPackageBuild( |
1174 | + distroarchseries=self.factory.makeDistroArchSeries()) |
1175 | + |
1176 | + # The good build is being handled by the good builder. |
1177 | + buildqueue = good_build.queueBuild() |
1178 | + buildqueue.builder = good_builder |
1179 | + |
1180 | + removeSecurityProxy(good_build.build_farm_job).date_started = UTC_NOW |
1181 | + |
1182 | + # The good builder requests information from a successful build, |
1183 | + # and up receiving it, updates the build's metadata. |
1184 | + # Our dependencies string goes into the build, and its |
1185 | + # date_finished will be set. |
1186 | + dependencies = self.factory.getUniqueString() |
1187 | + PackageBuild.storeBuildInfo( |
1188 | + good_build, None, {'dependencies': dependencies}) |
1189 | + clock.advance(1) |
1190 | + |
1191 | + # The broken scanner experiences a failure before the good |
1192 | + # scanner is receiving its data. This aborts the ongoing |
1193 | + # transaction. |
1194 | + # As a somewhat weird example, if the builder changed its own |
1195 | + # title, that change will be rolled back. |
1196 | + original_broken_builder_title = broken_builder.title |
1197 | + broken_builder.title = self.factory.getUniqueString() |
1198 | + broken_scanner._scanFailed(self.makeFakeFailure()) |
1199 | + |
1200 | + # The work done by the good scanner is retained. The |
1201 | + # storeBuildInfo code committed it. |
1202 | + self.assertEqual(dependencies, good_build.dependencies) |
1203 | + self.assertIsNot(None, good_build.date_finished) |
1204 | + |
1205 | + # The work done by the broken scanner is rolled back. |
1206 | + self.assertEqual(original_broken_builder_title, broken_builder.title) |
1207 | + |
1208 | |
1209 | class TestCancellationChecking(TestCaseWithFactory): |
1210 | """Unit tests for the checkCancellation method.""" |
1211 | |
1212 | === modified file 'lib/lp/buildmaster/tests/test_packagebuild.py' |
1213 | --- lib/lp/buildmaster/tests/test_packagebuild.py 2011-09-10 08:59:08 +0000 |
1214 | +++ lib/lp/buildmaster/tests/test_packagebuild.py 2011-12-08 16:19:26 +0000 |
1215 | @@ -1,4 +1,4 @@ |
1216 | -# Copyright 2010 Canonical Ltd. This software is licensed under the |
1217 | +# Copyright 2010-2011 Canonical Ltd. This software is licensed under the |
1218 | # GNU Affero General Public License version 3 (see the file LICENSE). |
1219 | |
1220 | """Tests for `IPackageBuild`.""" |
1221 | @@ -22,9 +22,7 @@ |
1222 | LaunchpadFunctionalLayer, |
1223 | LaunchpadZopelessLayer, |
1224 | ) |
1225 | -from lp.archiveuploader.uploadprocessor import ( |
1226 | - parse_build_upload_leaf_name, |
1227 | - ) |
1228 | +from lp.archiveuploader.uploadprocessor import parse_build_upload_leaf_name |
1229 | from lp.buildmaster.enums import ( |
1230 | BuildFarmJobType, |
1231 | BuildStatus, |
1232 | @@ -34,12 +32,11 @@ |
1233 | IPackageBuildSet, |
1234 | IPackageBuildSource, |
1235 | ) |
1236 | +from lp.buildmaster.model.builder import BuilderSlave |
1237 | from lp.buildmaster.model.buildfarmjob import BuildFarmJob |
1238 | from lp.buildmaster.model.packagebuild import PackageBuild |
1239 | from lp.buildmaster.tests.mock_slaves import WaitingSlave |
1240 | -from lp.registry.interfaces.pocket import ( |
1241 | - PackagePublishingPocket, |
1242 | - ) |
1243 | +from lp.registry.interfaces.pocket import PackagePublishingPocket |
1244 | from lp.testing import ( |
1245 | login, |
1246 | login_person, |
1247 | @@ -285,10 +282,7 @@ |
1248 | |
1249 | |
1250 | class TestHandleStatusMixin: |
1251 | - """Tests for `IPackageBuild`s handleStatus method. |
1252 | - |
1253 | - This should be run with a Trial TestCase. |
1254 | - """ |
1255 | + """Tests for `IPackageBuild`s handleStatus method.""" |
1256 | |
1257 | layer = LaunchpadZopelessLayer |
1258 | |
1259 | @@ -307,7 +301,7 @@ |
1260 | self.build.buildqueue_record.setDateStarted(UTC_NOW) |
1261 | self.slave = WaitingSlave('BuildStatus.OK') |
1262 | self.slave.valid_file_hashes.append('test_file_hash') |
1263 | - builder.setSlaveForTesting(self.slave) |
1264 | + self.patch(BuilderSlave, 'makeBuilderSlave', FakeMethod(self.slave)) |
1265 | |
1266 | # We overwrite the buildmaster root to use a temp directory. |
1267 | tempdir = tempfile.mkdtemp() |
1268 | @@ -348,7 +342,7 @@ |
1269 | def got_status(ignored): |
1270 | self.assertEqual(BuildStatus.FAILEDTOUPLOAD, self.build.status) |
1271 | self.assertResultCount(0, "failed") |
1272 | - self.assertIdentical(None, self.build.buildqueue_record) |
1273 | + self.assertIs(None, self.build.buildqueue_record) |
1274 | |
1275 | d = self.build.handleStatus('OK', None, { |
1276 | 'filemap': {'/tmp/myfile.py': 'test_file_hash'}, |
1277 | @@ -390,14 +384,10 @@ |
1278 | |
1279 | def got_status(ignored): |
1280 | if expected_notification: |
1281 | - self.failIf( |
1282 | - len(pop_notifications()) == 0, |
1283 | - "No notifications received") |
1284 | + self.assertNotEqual( |
1285 | + 0, len(pop_notifications()), "No notifications received.") |
1286 | else: |
1287 | - self.failIf( |
1288 | - len(pop_notifications()) > 0, |
1289 | - "Notifications received") |
1290 | - |
1291 | + self.assertContentEqual([], pop_notifications()) |
1292 | d = self.build.handleStatus(status, None, {}) |
1293 | return d.addCallback(got_status) |
1294 | |
1295 | |
1296 | === modified file 'lib/lp/code/model/tests/test_sourcepackagerecipebuild.py' |
1297 | --- lib/lp/code/model/tests/test_sourcepackagerecipebuild.py 2011-11-09 11:36:44 +0000 |
1298 | +++ lib/lp/code/model/tests/test_sourcepackagerecipebuild.py 2011-12-08 16:19:26 +0000 |
1299 | @@ -13,8 +13,8 @@ |
1300 | |
1301 | from pytz import utc |
1302 | from storm.locals import Store |
1303 | +from testtools.deferredruntest import AsynchronousDeferredRunTest |
1304 | import transaction |
1305 | -from twisted.trial.unittest import TestCase as TrialTestCase |
1306 | from zope.component import getUtility |
1307 | from zope.security.proxy import removeSecurityProxy |
1308 | |
1309 | @@ -28,6 +28,7 @@ |
1310 | from lp.app.errors import NotFoundError |
1311 | from lp.buildmaster.enums import BuildStatus |
1312 | from lp.buildmaster.interfaces.buildqueue import IBuildQueue |
1313 | +from lp.buildmaster.model.builder import BuilderSlave |
1314 | from lp.buildmaster.model.buildfarmjob import BuildFarmJob |
1315 | from lp.buildmaster.model.packagebuild import PackageBuild |
1316 | from lp.buildmaster.tests.mock_slaves import WaitingSlave |
1317 | @@ -588,14 +589,11 @@ |
1318 | self.assertEquals(0, len(notifications)) |
1319 | |
1320 | |
1321 | -class TestBuildNotifications(TrialTestCase): |
1322 | +class TestBuildNotifications(TestCaseWithFactory): |
1323 | |
1324 | layer = LaunchpadZopelessLayer |
1325 | |
1326 | - def setUp(self): |
1327 | - super(TestBuildNotifications, self).setUp() |
1328 | - from lp.testing.factory import LaunchpadObjectFactory |
1329 | - self.factory = LaunchpadObjectFactory() |
1330 | + run_tests_with = AsynchronousDeferredRunTest.make_factory(timeout=20) |
1331 | |
1332 | def prepare_build(self, fake_successful_upload=False): |
1333 | queue_record = self.factory.makeSourcePackageRecipeBuildJob() |
1334 | @@ -608,7 +606,7 @@ |
1335 | result=True) |
1336 | queue_record.builder = self.factory.makeBuilder() |
1337 | slave = WaitingSlave('BuildStatus.OK') |
1338 | - queue_record.builder.setSlaveForTesting(slave) |
1339 | + self.patch(BuilderSlave, 'makeBuilderSlave', FakeMethod(slave)) |
1340 | return build |
1341 | |
1342 | def assertDeferredNotifyCount(self, status, build, expected_count): |
1343 | @@ -666,5 +664,5 @@ |
1344 | |
1345 | |
1346 | class TestHandleStatusForSPRBuild( |
1347 | - MakeSPRecipeBuildMixin, TestHandleStatusMixin, TrialTestCase): |
1348 | + MakeSPRecipeBuildMixin, TestHandleStatusMixin, TestCaseWithFactory): |
1349 | """IPackageBuild.handleStatus works with SPRecipe builds.""" |
1350 | |
1351 | === modified file 'lib/lp/services/database/transaction_policy.py' |
1352 | --- lib/lp/services/database/transaction_policy.py 2011-09-29 05:49:18 +0000 |
1353 | +++ lib/lp/services/database/transaction_policy.py 2011-12-08 16:19:26 +0000 |
1354 | @@ -133,8 +133,11 @@ |
1355 | def _isInTransaction(self): |
1356 | """Is our store currently in a transaction?""" |
1357 | pg_connection = self.store._connection._raw_connection |
1358 | - status = pg_connection.get_transaction_status() |
1359 | - return status != TRANSACTION_STATUS_IDLE |
1360 | + if pg_connection is None: |
1361 | + return False |
1362 | + else: |
1363 | + status = pg_connection.get_transaction_status() |
1364 | + return status != TRANSACTION_STATUS_IDLE |
1365 | |
1366 | def _checkNoTransaction(self, error_msg): |
1367 | """Verify that no transaction is ongoing. |
1368 | |
1369 | === modified file 'lib/lp/soyuz/tests/test_binarypackagebuild.py' |
1370 | --- lib/lp/soyuz/tests/test_binarypackagebuild.py 2011-11-09 11:36:44 +0000 |
1371 | +++ lib/lp/soyuz/tests/test_binarypackagebuild.py 2011-12-08 16:19:26 +0000 |
1372 | @@ -1,4 +1,4 @@ |
1373 | -# Copyright 2009-2010 Canonical Ltd. This software is licensed under the |
1374 | +# Copyright 2009-2011 Canonical Ltd. This software is licensed under the |
1375 | # GNU Affero General Public License version 3 (see the file LICENSE). |
1376 | |
1377 | """Test Build features.""" |
1378 | @@ -10,7 +10,7 @@ |
1379 | |
1380 | import pytz |
1381 | from storm.store import Store |
1382 | -from twisted.trial.unittest import TestCase as TrialTestCase |
1383 | +from testtools.deferredruntest import AsynchronousDeferredRunTest |
1384 | from zope.component import getUtility |
1385 | from zope.security.proxy import removeSecurityProxy |
1386 | |
1387 | @@ -25,6 +25,7 @@ |
1388 | from lp.buildmaster.interfaces.builder import IBuilderSet |
1389 | from lp.buildmaster.interfaces.buildqueue import IBuildQueue |
1390 | from lp.buildmaster.interfaces.packagebuild import IPackageBuild |
1391 | +from lp.buildmaster.model.builder import BuilderSlave |
1392 | from lp.buildmaster.model.buildqueue import BuildQueue |
1393 | from lp.buildmaster.tests.mock_slaves import WaitingSlave |
1394 | from lp.buildmaster.tests.test_packagebuild import ( |
1395 | @@ -53,6 +54,7 @@ |
1396 | logout, |
1397 | TestCaseWithFactory, |
1398 | ) |
1399 | +from lp.testing.fakemethod import FakeMethod |
1400 | |
1401 | |
1402 | class TestBinaryPackageBuild(TestCaseWithFactory): |
1403 | @@ -522,7 +524,9 @@ |
1404 | self.build = gedit_src_hist.createMissingBuilds()[0] |
1405 | |
1406 | self.builder = self.factory.makeBuilder() |
1407 | - self.builder.setSlaveForTesting(WaitingSlave('BuildStatus.OK')) |
1408 | + self.patch( |
1409 | + BuilderSlave, 'makeBuilderSlave', |
1410 | + FakeMethod(WaitingSlave('BuildStatus.OK'))) |
1411 | self.build.buildqueue_record.markAsBuilding(self.builder) |
1412 | |
1413 | def testDependencies(self): |
1414 | @@ -568,9 +572,12 @@ |
1415 | |
1416 | |
1417 | class TestHandleStatusForBinaryPackageBuild( |
1418 | - MakeBinaryPackageBuildMixin, TestHandleStatusMixin, TrialTestCase): |
1419 | + MakeBinaryPackageBuildMixin, TestHandleStatusMixin, TestCaseWithFactory): |
1420 | """IPackageBuild.handleStatus works with binary builds.""" |
1421 | |
1422 | + layer = LaunchpadZopelessLayer |
1423 | + run_tests_with = AsynchronousDeferredRunTest.make_factory(timeout=20) |
1424 | + |
1425 | |
1426 | class TestBinaryPackageBuildWebservice(TestCaseWithFactory): |
1427 | """Test cases for BinaryPackageBuild on the webservice. |
1428 | |
1429 | === modified file 'lib/lp/translations/model/translationtemplatesbuildbehavior.py' |
1430 | --- lib/lp/translations/model/translationtemplatesbuildbehavior.py 2011-10-19 08:00:29 +0000 |
1431 | +++ lib/lp/translations/model/translationtemplatesbuildbehavior.py 2011-12-08 16:19:26 +0000 |
1432 | @@ -1,4 +1,4 @@ |
1433 | -# Copyright 2010 Canonical Ltd. This software is licensed under the |
1434 | +# Copyright 2010-2011 Canonical Ltd. This software is licensed under the |
1435 | # GNU Affero General Public License version 3 (see the file LICENSE). |
1436 | |
1437 | """An `IBuildFarmJobBehavior` for `TranslationTemplatesBuildJob`. |
1438 | @@ -13,9 +13,10 @@ |
1439 | |
1440 | import datetime |
1441 | import os |
1442 | +import tempfile |
1443 | + |
1444 | import pytz |
1445 | -import tempfile |
1446 | - |
1447 | +import transaction |
1448 | from twisted.internet import defer |
1449 | from zope.component import getUtility |
1450 | from zope.interface import implements |
1451 | @@ -28,6 +29,7 @@ |
1452 | ) |
1453 | from lp.buildmaster.model.buildfarmjobbehavior import BuildFarmJobBehaviorBase |
1454 | from lp.registry.interfaces.productseries import IProductSeriesSet |
1455 | +from lp.services.database.transaction_policy import DatabaseTransactionPolicy |
1456 | from lp.translations.interfaces.translationimportqueue import ( |
1457 | ITranslationImportQueue, |
1458 | ) |
1459 | @@ -132,13 +134,16 @@ |
1460 | def storeBuildInfo(build, queue_item, build_status): |
1461 | """See `IPackageBuild`.""" |
1462 | def got_log(lfa_id): |
1463 | - build.build.log = lfa_id |
1464 | - build.build.builder = queue_item.builder |
1465 | - build.build.date_started = queue_item.date_started |
1466 | - # XXX cprov 20060615 bug=120584: Currently buildduration includes |
1467 | - # the scanner latency, it should really be asking the slave for |
1468 | - # the duration spent building locally. |
1469 | - build.build.date_finished = datetime.datetime.now(pytz.UTC) |
1470 | + transaction.commit() |
1471 | + with DatabaseTransactionPolicy(read_only=False): |
1472 | + build.build.log = lfa_id |
1473 | + build.build.builder = queue_item.builder |
1474 | + build.build.date_started = queue_item.date_started |
1475 | + # XXX cprov 20060615 bug=120584: Currently buildduration |
1476 | + # includes the scanner latency. It should really be |
1477 | + # asking the slave for the duration spent building locally. |
1478 | + build.build.date_finished = datetime.datetime.now(pytz.UTC) |
1479 | + transaction.commit() |
1480 | |
1481 | d = build.getLogFromSlave(build, queue_item) |
1482 | return d.addCallback(got_log) |
Please also link bug 785679 to this MP, you've fixed it too since you've re-enabled the test.
This broadly looks okay, although I will admit to a large sense of fear and trepidation of this hitting production. You're absolutely right, though, the buildd-manager needs more tests and less magic.
I am wary of the amount of times you are calling transaction. commit( ) in this branch. Especially in tests, where you are calling into methods that could commit() first and avoid all that duplication.
I am holding off my vote for now, this branch requires some thought. And possibly a large amount of vodka.