Merge lp:~abentley/launchpad/memlimit-sendbranchmail into lp:launchpad

Proposed by Aaron Bentley
Status: Superseded
Proposed branch: lp:~abentley/launchpad/memlimit-sendbranchmail
Merge into: lp:launchpad
Diff against target: 384 lines (+116/-50)
8 files modified
cronscripts/sendbranchmail.py (+12/-15)
lib/lp/code/model/branchjob.py (+59/-11)
lib/lp/code/scripts/tests/test_sendbranchmail.py (+10/-10)
lib/lp/codehosting/vfs/transport.py (+10/-1)
lib/lp/registry/model/packagingjob.py (+2/-11)
lib/lp/services/utils.py (+20/-2)
lib/lp/translations/model/translationpackagingjob.py (+1/-0)
lib/lp/translations/tests/test_translationtemplatesbuildjob.py (+2/-0)
To merge this branch: bzr merge lp:~abentley/launchpad/memlimit-sendbranchmail
Reviewer Review Type Date Requested Status
Gavin Panella (community) Approve
Review via email: mp+65689@code.launchpad.net

This proposal has been superseded by a proposal from 2011-07-11.

Commit message

Memory-limit branch mail jobs.

Description of the change

= Summary =
Fix bug #585126: sendbranchmail with lp:~vcs-imports/linux/trunk is eating memory

== Proposed fix ==
Introduce a memory limit to branch mail jobs.

== Pre-implementation notes ==
None

== Implementation details ==
The TwistedJobRunner now supports memory limits for jobs, which means that each job runs in a separate process, and raises a MemoryError if it exceeds allowable memory use. This branch uses that mechanism for the sendbranchmail jobs.

In order to use the memory limit, we must switch to the TwistedJobRunner. In order to use the TwistedJobRunner, we need a suitable job source, so BranchMailJobSource was implemented. In order to implement BranchMailJobSource, we changed BranchJobDerived.get so that it will return instances of any given type. In order to make BranchJobDerived.get to return instances of any given type, we used RegisteredSubclass as its metaclass and provided _subclass and _register_subclass.

There were several drivebys as well:
AsyncVirtualServer can now behave as a ContextManager, meaning that the following works:
with get_ro_server():
    server_ops

Old bzr workarounds that are no longer needed were deleted.

Lots of lint was fixed

== Tests ==

bin/test -vt sendbranchmail -t branchjob

== Demo and Q/A ==
Not sure whether this is QAable.

= Launchpad lint =

Checking for conflicts and issues in changed files.

Linting changed files:
  lib/lp/translations/tests/test_translationtemplatesbuildjob.py
  cronscripts/sendbranchmail.py
  lib/lp/registry/model/packagingjob.py
  lib/lp/codehosting/vfs/transport.py
  lib/lp/code/scripts/tests/test_sendbranchmail.py
  lib/lp/code/model/branchjob.py
  lib/lp/services/utils.py

To post a comment you must log in.
Revision history for this message
Gavin Panella (allenap) wrote :

Looks good.

[1]

+ @staticmethod
+ def _register_subclass(cls):
+ """Register this class with its enumeration."""
+ # This would be a classmethod, except that subclasses (e.g.
+ # TranslationPackagingJob) need to be able to override it and call
+ # into it, and there's no syntax to call a base class's version of a
+ # classmethod with the subclass as the first parameter.

I think super(BranchJobDerived, cls)._register_subclass() might work.

[2]

+class BranchMailJobSource(BaseRunnableJobSource):
+ """Source of jobs that send mail about branches."""
+
+ memory_limit = 2 * (1024 ** 3)

2GB is quite a high limit. Is that correct?

review: Approve
Revision history for this message
Aaron Bentley (abentley) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11-06-23 12:36 PM, Gavin Panella wrote:
> [1]
>
> + @staticmethod
> + def _register_subclass(cls):
> + """Register this class with its enumeration."""
> + # This would be a classmethod, except that subclasses (e.g.
> + # TranslationPackagingJob) need to be able to override it and call
> + # into it, and there's no syntax to call a base class's version of a
> + # classmethod with the subclass as the first parameter.
>
> I think super(BranchJobDerived, cls)._register_subclass() might work.

This doesn't work because _register_subclass is called before
BranchJobDerived is part of the global namespace (because it's called by
the metaclass).

class Bar(type):

    def __init__(cls, name, bases, dict_):
        cls.upcall()

class Foo(object):

    __metaclass__ = Bar

    @classmethod
    def upcall(cls):
        super(Foo, cls)

$ python superme.py
Traceback (most recent call last):
  File "superme.py", line 7, in <module>
    class Foo(object):
  File "superme.py", line 4, in __init__
    cls.upcall()
  File "superme.py", line 13, in upcall
    super(Foo, cls)
NameError: global name 'Foo' is not defined

> [2]
>
> +class BranchMailJobSource(BaseRunnableJobSource):
> + """Source of jobs that send mail about branches."""
> +
> + memory_limit = 2 * (1024 ** 3)
>
> 2GB is quite a high limit. Is that correct?

I'm using the limit applied to cronscripts/scan_branches.py (in 12991)
as an example. sendbranchmail runs on the same machine and uses similar
technology, including bzrlib.

The bug reports that 5GB is problematic, so this cuts it by more than half.

If you're wondering whether bzrlib really needs that much, well, it has
been quite memory-hungry historically, but I understand bzr 2.4 will be
a vast improvement.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk4DeNAACgkQ0F+nu1YWqI2WkQCfZri8hj8jxxSp76TSrDZ1cvAL
7vEAniM+dsU6qFgVa72NuZvtr/YxM9PU
=Mzw0
-----END PGP SIGNATURE-----

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'cronscripts/sendbranchmail.py'
2--- cronscripts/sendbranchmail.py 2010-05-19 18:07:56 +0000
3+++ cronscripts/sendbranchmail.py 2011-06-23 21:13:26 +0000
4@@ -1,6 +1,6 @@
5 #!/usr/bin/python -S
6 #
7-# Copyright 2009 Canonical Ltd. This software is licensed under the
8+# Copyright 2008-2011 Canonical Ltd. This software is licensed under the
9 # GNU Affero General Public License version 3 (see the file LICENSE).
10
11 # pylint: disable-msg=W0403
12@@ -12,14 +12,18 @@
13
14 __metaclass__ = type
15
16+
17+import logging
18+
19 import _pythonpath
20-from zope.component import getUtility
21
22 from canonical.config import config
23-from lp.codehosting.vfs import get_ro_server
24-from lp.services.job.runner import JobRunner
25-from lp.code.interfaces.branchjob import (
26- IRevisionMailJobSource, IRevisionsAddedJobSource)
27+from lp.services.job.runner import (
28+ TwistedJobRunner,
29+ )
30+from lp.code.model.branchjob import (
31+ BranchMailJobSource,
32+ )
33 from lp.services.scripts.base import LaunchpadCronScript
34 from canonical.launchpad.webapp.errorlog import globalErrorUtility
35
36@@ -29,15 +33,8 @@
37
38 def main(self):
39 globalErrorUtility.configure('sendbranchmail')
40- jobs = list(getUtility(IRevisionMailJobSource).iterReady())
41- jobs.extend(getUtility(IRevisionsAddedJobSource).iterReady())
42- runner = JobRunner(jobs, self.logger)
43- server = get_ro_server()
44- server.start_server()
45- try:
46- runner.runAll()
47- finally:
48- server.stop_server()
49+ runner = TwistedJobRunner.runFromSource(
50+ BranchMailJobSource, 'send-branch-mail', logging.getLogger())
51 self.logger.info(
52 'Ran %d RevisionMailJobs.' % len(runner.completed_jobs))
53
54
55=== modified file 'lib/lp/code/model/branchjob.py'
56--- lib/lp/code/model/branchjob.py 2011-06-20 20:07:33 +0000
57+++ lib/lp/code/model/branchjob.py 2011-06-23 21:13:26 +0000
58@@ -13,6 +13,7 @@
59 ]
60
61 import contextlib
62+from itertools import chain
63 import operator
64 import os
65 import shutil
66@@ -106,7 +107,11 @@
67 from lp.scripts.helpers import TransactionFreeOperation
68 from lp.services.job.interfaces.job import JobStatus
69 from lp.services.job.model.job import Job
70-from lp.services.job.runner import BaseRunnableJob
71+from lp.services.job.runner import (
72+ BaseRunnableJob,
73+ BaseRunnableJobSource,
74+ )
75+from lp.services.utils import RegisteredSubclass
76 from lp.translations.interfaces.translationimportqueue import (
77 ITranslationImportQueue,
78 )
79@@ -217,6 +222,20 @@
80
81 class BranchJobDerived(BaseRunnableJob):
82
83+ __metaclass__ = RegisteredSubclass
84+ _subclass = {}
85+
86+ @staticmethod
87+ def _register_subclass(cls):
88+ """Register this class with its enumeration."""
89+ # Why not a classmethod? See RegisteredSubclass.__init__.
90+ job_type = getattr(cls, 'class_job_type', None)
91+ if job_type is not None:
92+ value = cls._subclass.setdefault(job_type, cls)
93+ assert value is cls, (
94+ '%s already registered to %s.' % (
95+ job_type.name, value.__name__))
96+
97 delegates(IBranchJob)
98
99 def __init__(self, branch_job):
100@@ -251,19 +270,29 @@
101 And(BranchJob.job_type == cls.class_job_type,
102 BranchJob.job == Job.id,
103 Job.id.is_in(Job.ready_jobs)))
104- return (cls(job) for job in jobs)
105-
106- @classmethod
107- def get(cls, key):
108- """Return the instance of this class whose key is supplied.
109+ return (cls._getInstance(job) for job in jobs)
110+
111+ @classmethod
112+ def _getInstance(cls, job):
113+ return cls._subclass[job.job_type](job)
114+
115+ @classmethod
116+ def get(cls, key, desired_classes=None):
117+ """Return the instance of this class (or a subclass) whose key
118+ is supplied. Calling this method on a subclass returns only values
119+ for that subclass.
120
121 :raises: SQLObjectNotFound
122 """
123- instance = IStore(BranchJob).get(BranchJob, key)
124- if instance is None or instance.job_type != cls.class_job_type:
125- raise SQLObjectNotFound(
126- 'No occurrence of %s has key %s' % (cls.__name__, key))
127- return cls(instance)
128+ if desired_classes is None:
129+ desired_classes = cls
130+ branchjob = IStore(BranchJob).get(BranchJob, key)
131+ if branchjob is not None:
132+ job = cls._getInstance(branchjob)
133+ if isinstance(job, cls):
134+ return job
135+ raise SQLObjectNotFound(
136+ 'No occurrence of %s has key %s' % (cls.__name__, key))
137
138 def getOopsVars(self):
139 """See `IRunnableJob`."""
140@@ -764,6 +793,25 @@
141 return outf.getvalue()
142
143
144+class BranchMailJobSource(BaseRunnableJobSource):
145+ """Source of jobs that send mail about branches."""
146+
147+ memory_limit = 2 * (1024 ** 3)
148+
149+ @staticmethod
150+ def contextManager():
151+ return get_ro_server()
152+
153+ @staticmethod
154+ def iterReady():
155+ return chain(
156+ RevisionMailJob.iterReady(), RevisionsAddedJob.iterReady())
157+
158+ @staticmethod
159+ def get(key):
160+ return BranchJobDerived.get(key, (RevisionMailJob, RevisionsAddedJob))
161+
162+
163 class RosettaUploadJob(BranchJobDerived):
164 """A Job that uploads translation files to Rosetta."""
165
166
167=== modified file 'lib/lp/code/scripts/tests/test_sendbranchmail.py'
168--- lib/lp/code/scripts/tests/test_sendbranchmail.py 2010-10-26 15:47:24 +0000
169+++ lib/lp/code/scripts/tests/test_sendbranchmail.py 2011-06-23 21:13:26 +0000
170@@ -20,7 +20,6 @@
171 RevisionMailJob,
172 RevisionsAddedJob,
173 )
174-from lp.services.osutils import override_environ
175 from lp.testing import TestCaseWithFactory
176
177
178@@ -41,8 +40,7 @@
179 tree.add('foo')
180 # XXX: AaronBentley 2010-08-06 bug=614404: a bzr username is
181 # required to generate the revision-id.
182- with override_environ(BZR_EMAIL='me@example.com'):
183- tree.commit('Added foo.', rev_id='rev1')
184+ tree.commit('Added foo.', rev_id='rev1', committer='me@example.com')
185 return branch, tree
186
187 def test_sendbranchmail(self):
188@@ -55,7 +53,9 @@
189 retcode, stdout, stderr = run_script(
190 'cronscripts/sendbranchmail.py', [])
191 self.assertEqual(
192- 'INFO Creating lockfile: /var/lock/launchpad-sendbranchmail.lock\n'
193+ 'INFO Creating lockfile: '
194+ '/var/lock/launchpad-sendbranchmail.lock\n'
195+ 'INFO Running through Twisted.\n'
196 'INFO Ran 1 RevisionMailJobs.\n', stderr)
197 self.assertEqual('', stdout)
198 self.assertEqual(0, retcode)
199@@ -70,7 +70,8 @@
200 retcode, stdout, stderr = run_script(
201 'cronscripts/sendbranchmail.py', [])
202 self.assertIn(
203- 'INFO Creating lockfile: /var/lock/launchpad-sendbranchmail.lock\n',
204+ 'INFO Creating lockfile: '
205+ '/var/lock/launchpad-sendbranchmail.lock\n',
206 stderr)
207 self.assertIn('INFO Job resulted in OOPS:', stderr)
208 self.assertIn('INFO Ran 0 RevisionMailJobs.\n', stderr)
209@@ -82,17 +83,16 @@
210 self.useBzrBranches()
211 branch, tree = self.createBranch()
212 tree.bzrdir.root_transport.put_bytes('foo', 'baz')
213- # XXX: AaronBentley 2010-08-06 bug=614404: a bzr username is
214- # required to generate the revision-id.
215- with override_environ(BZR_EMAIL='me@example.com'):
216- tree.commit('Added foo.', rev_id='rev2')
217+ tree.commit('Added foo.', rev_id='rev2', committer='me@example.com')
218 RevisionsAddedJob.create(
219 branch, 'rev1', 'rev2', 'from@example.org')
220 transaction.commit()
221 retcode, stdout, stderr = run_script(
222 'cronscripts/sendbranchmail.py', [])
223 self.assertEqual(
224- 'INFO Creating lockfile: /var/lock/launchpad-sendbranchmail.lock\n'
225+ 'INFO Creating lockfile:'
226+ ' /var/lock/launchpad-sendbranchmail.lock\n'
227+ 'INFO Running through Twisted.\n'
228 'INFO Ran 1 RevisionMailJobs.\n',
229 stderr)
230 self.assertEqual('', stdout)
231
232=== modified file 'lib/lp/codehosting/vfs/transport.py'
233--- lib/lp/codehosting/vfs/transport.py 2011-03-29 03:24:05 +0000
234+++ lib/lp/codehosting/vfs/transport.py 2011-06-23 21:13:26 +0000
235@@ -102,7 +102,7 @@
236 """Return the absolute, escaped path to `relpath` without the schema.
237 """
238 return urlutils.joinpath(
239- self.base[len(self.server.get_url())-1:], relpath)
240+ self.base[len(self.server.get_url()) - 1:], relpath)
241
242 def _getUnderylingTransportAndPath(self, relpath):
243 """Return the underlying transport and path for `relpath`."""
244@@ -178,6 +178,7 @@
245
246 def iter_files_recursive(self):
247 deferred = self._getUnderylingTransportAndPath('.')
248+
249 @no_traceback_failures
250 def iter_files((transport, path)):
251 return transport.clone(path).iter_files_recursive()
252@@ -186,6 +187,7 @@
253
254 def listable(self):
255 deferred = self._getUnderylingTransportAndPath('.')
256+
257 @no_traceback_failures
258 def listable((transport, path)):
259 return transport.listable()
260@@ -427,3 +429,10 @@
261 return
262 self._is_started = False
263 unregister_transport(self.get_url(), self._transportFactory)
264+
265+ def __enter__(self):
266+ self.start_server()
267+ return self
268+
269+ def __exit__(self, exc_type, exc_value, traceback):
270+ self.stop_server()
271
272=== modified file 'lib/lp/registry/model/packagingjob.py'
273--- lib/lp/registry/model/packagingjob.py 2011-03-23 16:28:51 +0000
274+++ lib/lp/registry/model/packagingjob.py 2011-06-23 21:13:26 +0000
275@@ -35,6 +35,7 @@
276 JobStatus,
277 )
278 from lp.services.job.model.job import Job
279+from lp.services.utils import RegisteredSubclass
280
281
282 class PackagingJobType(DBEnumeratedType):
283@@ -96,13 +97,6 @@
284 self.productseries = productseries
285
286
287-class RegisteredSubclass(type):
288- """Metaclass for when subclasses should be registered."""
289-
290- def __init__(cls, name, bases, dict_):
291- cls._register_subclass(cls)
292-
293-
294 class PackagingJobDerived:
295 """Base class for specialized Packaging Job types."""
296
297@@ -120,10 +114,7 @@
298 @staticmethod
299 def _register_subclass(cls):
300 """Register this class with its enumeration."""
301- # This would be a classmethod, except that subclasses (e.g.
302- # TranslationPackagingJob) need to be able to override it and call
303- # into it, and there's no syntax to call a base class's version of a
304- # classmethod with the subclass as the first parameter.
305+ # Why not a classmethod? See RegisteredSubclass.__init__.
306 job_type = getattr(cls, 'class_job_type', None)
307 if job_type is not None:
308 value = cls._subclass.setdefault(job_type, cls)
309
310=== modified file 'lib/lp/services/utils.py'
311--- lib/lp/services/utils.py 2011-04-01 16:30:57 +0000
312+++ lib/lp/services/utils.py 2011-06-23 21:13:26 +0000
313@@ -18,6 +18,7 @@
314 'file_exists',
315 'iter_list_chunks',
316 'iter_split',
317+ 'RegisteredSubclass',
318 'run_capturing_output',
319 'synchronize',
320 'text_delta',
321@@ -129,7 +130,7 @@
322 I'm amazed this isn't in itertools (mwhudson).
323 """
324 for i in range(0, len(a_list), size):
325- yield a_list[i:i+size]
326+ yield a_list[i:i + size]
327
328
329 def synchronize(source, target, add, remove):
330@@ -239,7 +240,7 @@
331 then reassemble.
332 """
333 # Make sure there is at least one newline so the split works.
334- first, rest = (s+'\n').split('\n', 1)
335+ first, rest = (s + '\n').split('\n', 1)
336 return (first + '\n' + dedent(rest)).strip()
337
338
339@@ -285,3 +286,20 @@
340 variables, and helps to avoid typos.
341 """
342 sys._getframe(1).f_locals["__traceback_info__"] = info
343+
344+
345+class RegisteredSubclass(type):
346+ """Metaclass for when subclasses should be registered."""
347+
348+ def __init__(cls, name, bases, dict_):
349+ # _register_subclass must be a static method to permit upcalls.
350+ #
351+ # We cannot use super(Class, cls) to do the upcalls, because Class
352+ # isn't fully defined yet. (Remember, we're calling this from a
353+ # metaclass.)
354+ #
355+ # Without using super, a classmethod that overrides another
356+ # classmethod has no reasonable way to call the overridden version AND
357+ # provide its class as first parameter (i.e. "cls"). Therefore, we
358+ # must use a static method.
359+ cls._register_subclass(cls)
360
361=== modified file 'lib/lp/translations/model/translationpackagingjob.py'
362--- lib/lp/translations/model/translationpackagingjob.py 2011-05-12 20:21:58 +0000
363+++ lib/lp/translations/model/translationpackagingjob.py 2011-06-23 21:13:26 +0000
364@@ -52,6 +52,7 @@
365
366 @staticmethod
367 def _register_subclass(cls):
368+ # Why not a classmethod? See RegisteredSubclass.__init__.
369 PackagingJobDerived._register_subclass(cls)
370 job_type = getattr(cls, 'class_job_type', None)
371 if job_type is not None:
372
373=== modified file 'lib/lp/translations/tests/test_translationtemplatesbuildjob.py'
374--- lib/lp/translations/tests/test_translationtemplatesbuildjob.py 2011-05-27 21:12:25 +0000
375+++ lib/lp/translations/tests/test_translationtemplatesbuildjob.py 2011-06-23 21:13:26 +0000
376@@ -156,6 +156,8 @@
377 behavior by setting an attribute of the class (not an object!) at
378 the beginning of every test.
379 """
380+ class_job_type = None
381+
382 # Fake _hasPotteryCompatibleSetup, and if so, make it give what
383 # answer?
384 fake_pottery_compatibility = None