Merge lp:~mluto/mailman/qrunnersleep into lp:mailman/2.1

Proposed by Michael Lutonsky
Status: Merged
Merged at revision: 1808
Proposed branch: lp:~mluto/mailman/qrunnersleep
Merge into: lp:mailman/2.1
Diff against target: 20 lines (+3/-0)
1 file modified
bin/qrunner (+3/-0)
To merge this branch: bzr merge lp:~mluto/mailman/qrunnersleep
Reviewer Review Type Date Requested Status
Mark Sapiro Approve
Review via email: mp+363767@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Mark Sapiro (msapiro) wrote :

Please explain what the need for this is. Normally, Mailman/Queue/Runner.py will sleep for QRUNNER_SLEEP_TIME after running _oneloop() if there are no more files in the queue.

Further, your change is only executed when bin/qrunner is run manually with --runner=All which is not the normal way to run all the runners. When they are run via bin/mailmanctl, bin/qrunner is called separately for each runner slice and the "Fast track for one infinite runner" code is executed.

review: Needs Information
Revision history for this message
Michael Lutonsky (mluto) wrote :

> Normally, Mailman/Queue/Runner.py will sleep for QRUNNER_SLEEP_TIME after running _oneloop() if there are no more files in the queue.

True! But the code sleeping in Runner.py isn't executed at all, because the check for self._stop breaks the loop before it gets a chance to do so. This leads to the Runner exiting right away and being immediately started again, in an endless loop, eating up a CPU core. I'd also be happy with moving the sleep like below, so this doesn't happen quite as fast (see diff below).

As far as I understand the code, this should happen in any mailman2.1 setup, when used with --runner=All.

> Further, your change is only executed when bin/qrunner is run manually with --runner=All which is not the normal way to run all the runners. When they are run via bin/mailmanctl, bin/qrunner is called separately for each runner slice and the "Fast track for one infinite runner" code is executed.

We run them with --runner=All so there is a process for our service manage to mange for. Starting them using mailmanctl forks them out of our control and then exits mailmanctl. The --runner=All option seems to be documented normally and not depreciated, so I expected it to work without any quirks. Please let me know if there is an alternative to that option.

--- Mailman/Queue/Runner.py 2018-06-17 23:47:34 +0000
+++ Mailman/Queue/Runner.py 2019-02-28 17:14:34 +0000
@@ -72,13 +72,13 @@
                     # shouldn't be called here. There should be one more
                     # _doperiodic() call at the end of the _oneloop() loop.
                     self._doperiodic()
- # If the stop flag is set, we're done.
- if self._stop:
- break
                     # Give the runner an opportunity to snooze for a while,
                     # but pass it the file count so it can decide whether to
                     # do more work now or not.
                     self._snooze(filecnt)
+ # If the stop flag is set, we're done.
+ if self._stop:
+ break
             except KeyboardInterrupt:
                 pass
         finally:

Revision history for this message
Mark Sapiro (msapiro) wrote :

This is complicated. With your second change every runner calls _snooze(filecnt) before exiting. This means that the "for qrunner in qrunners:" loop in bin/qrunner doesn't complete until every runner exits which in the case of RetryRunner is 15 minutes, so that's not acceptable.

Yet with the initial change, the "for qrunner in qrunners:" loop runs after sleeping only QRUNNER_SLEEP_TIME which means RetryRunner runs that often. I suppose that is not a serious issue because the actual retries are controlled by DELIVERY_RETRY_WAIT, at least since 2.1.26.

Revision history for this message
Mark Sapiro (msapiro) wrote :

Please open an issue at https://bugs.launchpad.net/mailman/+filebug for this. I will accept the original merge request as a fix for that.

review: Approve
Revision history for this message
Michael Lutonsky (mluto) wrote :

> Please open an issue at https://bugs.launchpad.net/mailman/+filebug for this.
> I will accept the original merge request as a fix for that.

Thank you! :)
https://bugs.launchpad.net/mailman/+bug/1818205

Revision history for this message
Michael Lutonsky (mluto) wrote :

Thanks for the merge!

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bin/qrunner'
2--- bin/qrunner 2018-06-17 23:47:34 +0000
3+++ bin/qrunner 2019-02-28 08:27:16 +0000
4@@ -75,6 +75,7 @@
5 import sys
6 import getopt
7 import signal
8+import time
9
10 import paths
11 from Mailman import mm_cfg
12@@ -268,6 +269,8 @@
13 qrunner.run()
14 if once:
15 break
16+ if mm_cfg.QRUNNER_SLEEP_TIME > 0:
17+ time.sleep(mm_cfg.QRUNNER_SLEEP_TIME)
18 syslog('qrunner', 'Main qrunner loop exiting.')
19 # All done
20 sys.exit(loop.status)