Merge lp:~adeuring/launchpad/abort-transaction-in-job-queue into lp:launchpad
| Status: | Merged | ||||
|---|---|---|---|---|---|
| Approved by: | Abel Deuring on 2012-04-13 | ||||
| Approved revision: | no longer in the source branch. | ||||
| Merged at revision: | 15095 | ||||
| Proposed branch: | lp:~adeuring/launchpad/abort-transaction-in-job-queue | ||||
| Merge into: | lp:launchpad | ||||
| Diff against target: |
177 lines (+118/-7) 3 files modified
lib/lp/services/job/interfaces/job.py (+5/-5) lib/lp/services/job/model/job.py (+4/-2) lib/lp/services/job/tests/test_job.py (+109/-0) |
||||
| To merge this branch: | bzr merge lp:~adeuring/launchpad/abort-transaction-in-job-queue | ||||
| Related bugs: |
|
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| j.c.sackett (community) | 2012-04-12 | Approve on 2012-04-12 | |
| Richard Harding (community) | code* | 2012-04-12 | Approve on 2012-04-12 |
|
Review via email:
|
|||
Commit Message
Parameter abort_transaction added to Job.queue()
Description of the Change
This branch adds a parameter "abort_transaction" to the method Job.queue().
Jobs are now controlled by the lazr.jobrunner mudule, where I recently
added a timeout mechanism: This Celery based job runner can define
more than one queues for the same jobs, with different timeout values.
If a job times out in a "fast" queue, it cen be re-queue in another
queue with a longer timeout value.
This requires to change the status of the job back to JobStatus.WAITING.
This is done in Job.queue() -- but this method is also called when
a job raises an exception that is listed in Job.retry_
This retry mechanism assumes that the job left the database in a
consistent state when it raised the "retry exception", hence Job.queue()
calls transaction.
is interrupted by a timeout, so I added the parameter abort_transaction,
which is True by default.
I also noticed that the paremeter manage_transaction of the methods
Job.start(), Job.complete(), Job.fail() etc was not tested, so I added
these tests:
./bin/test -vvt lp.services.
no lint
| Abel Deuring (adeuring) wrote : | # |
| Richard Harding (rharding) wrote : | # |
Abel, feel free to correct me here.
- I worry that having method params for manage_transaction that they could be called and not match. So start has it set to true, but not for fail and so we end up with left over transactions not getting processed correctly. It would seem an object level attribute of 'has' or 'uses' transaction would be a better single point to hold that all of the process methods check might help keep things more consistant.
- There's no mention of the LoC policy. Please note in the MP if this is offset somewhere.
- The comments in the test code seem prematurely new lined. For instance: #74-75 would fit in one line per out lint policy. Since this would reduce the LoC impact could these be adjusted?
Thanks for adding the new tests!
| Abel Deuring (adeuring) wrote : | # |
Rick, right, the parameter manage_transaction is indeed good candidate for inconsistencies. The problem here: The job runner implementation in the main LP code (now obsolete) called commit() or abort() where needed. The new implementation in lazr.jobrunner does not do this: It theory. it should completely ignore transactions, because these are relevant for _job_ but not the job runner. The problem with this approach is that some of the methods start(), queue() (sorry, can't remember right now which exactly) are called in parts of the main LP code which is not directly related to "real" job management. (I think there is for example code which creates a job instance and then immediately puts it into the status "suspended".) Committing transactions there would be bad, so I added the parameter manage_transaction. It is only set to True in lazr.jobrunner, but it is used the consistently.
Your suggestion to use something like Job.transaction to automatically decide if start(), complete() etc is very interesting but will require some more thoughts where this property should be added. Another option would be to replace the calls of queue() or suspend() outside the job runner with some other methods like "tweakJobStatus
But I would like to defer this to another branch.
LoC policy: Once lazr.jobrunner is fully implenented and running, Aaron and I expect that we can remove a reasonable amount of lines from LP.
| j.c.sackett (jcsackett) wrote : | # |
I have nothing further to add to Rick's comments; thanks for this code and the other jobrunner work.

...and I noticed that the methods start(), complete(),fail() etc of the class IJob were not up to date: I added the parameter manage_transaction.