Merge lp:~jtv/launchpad/bug-702228 into lp:launchpad

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: Robert Collins
Approved revision: no longer in the source branch.
Merged at revision: 12215
Proposed branch: lp:~jtv/launchpad/bug-702228
Merge into: lp:launchpad
Diff against target: 242 lines (+134/-27)
4 files modified
cronscripts/rosetta-export-queue.py (+5/-2)
lib/lp/translations/doc/poexport-queue.txt (+1/-1)
lib/lp/translations/scripts/po_export_queue.py (+61/-24)
lib/lp/translations/tests/test_exportresult.py (+67/-0)
To merge this branch: bzr merge lp:~jtv/launchpad/bug-702228
Reviewer Review Type Date Requested Status
j.c.sackett (community) code* Approve
Robert Collins (community) Approve
Review via email: mp+46165@code.launchpad.net

Commit message

[r=jcsackett,lifeless][ui=none][bug=702228] Avoid long-running master transactions during translations export.

Description of the change

= Bug 702228 =

A particularly difficult translations export request was breaking the export queue. It took over 3 hours to run, at which point our watchdog script that looks for long transactions on the master database killed its master store connection.

This branch addresses that by avoiding use of the master database, so that the connection will not start a transaction until the very end of a request (where a master connection is really needed). An occasional long transaction on the slave is not such a big problem, and certainly won't take any write locks. So the watchdog will leave the slave connection in peace.

There is a nesting of database policies: the SlaveDatabasePolicy makes the slave store the default, so that e.g. some use of cursor() that we still have will use the slave. This is merely a little help in reducing accidental use of the master store. But I also bracketed the time-critical part in a SlaveOnlyDatabasePolicy. This outright disallows use of the master store.

It's not quite obvious, but the method that fetches the next request off the queue accesses the master store. It doesn't make any changes there, but it does verify that the oldest fully-replicated request it gets off the slave store has not already been deleted from the master store. That situation could otherwise occur if the script deleted the previous request from the master store, then again tried to fetch the oldest request off the slave before the deletion had replicated.

I tested thusly:
{{{
./bin/test -vvc lp.translations -t export
}}}

To Q/A, first ensure that regular exports still work (both in Ubuntu and upstream; include XPI) by running the script manually, and seeing that it still produces valid tarballs and individual files. On staging that requires manually clearing the backlog first.

Watch pg_stat_activity on the master database during a large export. The export script's master backend process (as logged by the script) should be "<IDLE>" until the end, not "<IDLE> in transaction."

No lint,

Jeroen

To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) :
review: Approve
Revision history for this message
j.c.sackett (jcsackett) wrote :
Download full text (3.9 KiB)

Jeroen--

This looks good. I have couple comments below, but the one on the test case is the only one I feel is important. This is good to land with that change.

> === modified file 'lib/lp/translations/scripts/po_export_queue.py'
> --- lib/lp/translations/scripts/po_export_queue.py 2010-08-20 20:31:18 +0000
> +++ lib/lp/translations/scripts/po_export_queue.py 2011-01-13 18:25:08 +0000
> @@ -250,6 +252,46 @@
> 'request_url': self.request_url,
> }
>
> + def setExportFile(self, exported_file):
> + """Attach an exported file to the result, for upload to the Librarian.
> +
> + After this is set, `upload` will perform the actual upload. The two
> + actions are separated so as to isolate write access to the database.
> +
> + :param exported_file: An `IExportedTranslationFile` containing the
> + exported data.
> + """
> + self.exported_file = exported_file

Given how very little this method is doing, I'm not sure it needs to be a separate method. It's not hurting anything to have it this way, but seems slightly unnecesary.

> === added file 'lib/lp/translations/tests/test_exportresult.py'
> --- lib/lp/translations/tests/test_exportresult.py 1970-01-01 00:00:00 +0000
> +++ lib/lp/translations/tests/test_exportresult.py 2011-01-13 18:25:08 +0000
> @@ -0,0 +1,67 @@
> +# Copyright 2010 Canonical Ltd. This software is licensed under the
> +# GNU Affero General Public License version 3 (see the file LICENSE).
> +
> +"""Tests for `ExportResult`."""
> +
> +__metaclass__ = type
> +
> +import hashlib
> +import os.path
> +from StringIO import StringIO
> +
> +from canonical.librarian.testing.fake import FakeLibrarian
> +from canonical.testing.layers import ZopelessDatabaseLayer
> +from lp.services.log.logger import DevNullLogger
> +from lp.testing import TestCaseWithFactory
> +from lp.translations.scripts.po_export_queue import ExportResult
> +
> +
> +class FakeExportedTranslationFile:
> + """Fake `IExportedTranslationFile` for testing."""
> +
> + def __init__(self, path, content, content_type):
> + self.path = path
> + base, ext = os.path.splitext(path)
> + self.file_extension = ext
> + self.size = len(content)
> + self.content = content
> + self.file = StringIO(content)
> + self.content_type = content_type
> +
> + def read(self, *args, **kwargs):
> + return self.file.read(*args, **kwargs)
> +
> +
> +class TestExportResult(TestCaseWithFactory):
> +
> + layer = ZopelessDatabaseLayer
> +
> + def makeExportResult(self):
> + request = [self.factory.makePOFile()]
> + requester = self.factory.makePerson()
> + logger = DevNullLogger()
> + return ExportResult(requester, request, logger)
> +
> + def makeExportedTranslationFile(self):
> + filename = self.factory.getUniqueString()
> + content = self.factory.getUniqueString()
> + mime_type = "text/plain"
> + return FakeExportedTranslationFile(filename, content, mime_type)
> +
> + def test_upload_exported_file(self):
> + librarian = self.useFixture(FakeLibrarian())
> + export = self.makeExp...

Read more...

review: Approve (code*)
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

You're right about the assertNotEqual. Fixed with thanks.

You may also be right about the other method, but I can't think of anything to do about it. I wouldn't want to set the property on the object from outside the object's own code: I feel that would be highly nonobvious, sort of like an out-of-place parameter. So I left that as it was.

Jeroen

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'cronscripts/rosetta-export-queue.py'
--- cronscripts/rosetta-export-queue.py 2010-04-27 19:48:39 +0000
+++ cronscripts/rosetta-export-queue.py 2011-01-13 23:31:34 +0000
@@ -8,17 +8,20 @@
8import _pythonpath8import _pythonpath
99
10from canonical.database.sqlbase import ISOLATION_LEVEL_READ_COMMITTED10from canonical.database.sqlbase import ISOLATION_LEVEL_READ_COMMITTED
11from canonical.launchpad.webapp.dbpolicy import SlaveDatabasePolicy
11from lp.translations.scripts.po_export_queue import process_queue12from lp.translations.scripts.po_export_queue import process_queue
12from lp.services.scripts.base import LaunchpadCronScript13from lp.services.scripts.base import LaunchpadCronScript
1314
1415
15class RosettaExportQueue(LaunchpadCronScript):16class RosettaExportQueue(LaunchpadCronScript):
17 """Translation exports."""
18
16 def main(self):19 def main(self):
17 self.txn.set_isolation_level(ISOLATION_LEVEL_READ_COMMITTED)20 self.txn.set_isolation_level(ISOLATION_LEVEL_READ_COMMITTED)
18 process_queue(self.txn, self.logger)21 with SlaveDatabasePolicy():
22 process_queue(self.txn, self.logger)
1923
2024
21if __name__ == '__main__':25if __name__ == '__main__':
22 script = RosettaExportQueue('rosetta-export-queue', dbuser='poexport')26 script = RosettaExportQueue('rosetta-export-queue', dbuser='poexport')
23 script.lock_and_run()27 script.lock_and_run()
24
2528
=== modified file 'lib/lp/translations/doc/poexport-queue.txt'
--- lib/lp/translations/doc/poexport-queue.txt 2010-12-02 16:13:51 +0000
+++ lib/lp/translations/doc/poexport-queue.txt 2011-01-13 23:31:34 +0000
@@ -42,7 +42,7 @@
42 ... result.addFailure()42 ... result.addFailure()
4343
44In this case, there is an error, so there shouldn't be a URL to download44In this case, there is an error, so there shouldn't be a URL to download
45anything, if we set it, the system will fail:45anything. If we set it, the system will fail:
4646
47 >>> result.url = 'http://someplace.com/somefile.tar.gz'47 >>> result.url = 'http://someplace.com/somefile.tar.gz'
4848
4949
=== modified file 'lib/lp/translations/scripts/po_export_queue.py'
--- lib/lp/translations/scripts/po_export_queue.py 2010-08-20 20:31:18 +0000
+++ lib/lp/translations/scripts/po_export_queue.py 2011-01-13 23:31:34 +0000
@@ -23,6 +23,7 @@
23from canonical.launchpad.interfaces.librarian import ILibraryFileAliasSet23from canonical.launchpad.interfaces.librarian import ILibraryFileAliasSet
24from canonical.launchpad.mail import simple_sendmail24from canonical.launchpad.mail import simple_sendmail
25from canonical.launchpad.webapp import canonical_url25from canonical.launchpad.webapp import canonical_url
26from canonical.launchpad.webapp.dbpolicy import SlaveOnlyDatabasePolicy
26from lp.registry.interfaces.productseries import IProductSeries27from lp.registry.interfaces.productseries import IProductSeries
27from lp.registry.interfaces.sourcepackage import ISourcePackage28from lp.registry.interfaces.sourcepackage import ISourcePackage
28from lp.translations.interfaces.poexportrequest import IPOExportRequestSet29from lp.translations.interfaces.poexportrequest import IPOExportRequestSet
@@ -54,6 +55,7 @@
54 self.url = None55 self.url = None
55 self.failure = None56 self.failure = None
56 self.logger = logger57 self.logger = logger
58 self.exported_file = None
5759
58 self.requested_exports = list(requested_exports)60 self.requested_exports = list(requested_exports)
59 export_requested_at = self._getExportRequestOrigin()61 export_requested_at = self._getExportRequestOrigin()
@@ -250,6 +252,46 @@
250 'request_url': self.request_url,252 'request_url': self.request_url,
251 }253 }
252254
255 def setExportFile(self, exported_file):
256 """Attach an exported file to the result, for upload to the Librarian.
257
258 After this is set, `upload` will perform the actual upload. The two
259 actions are separated so as to isolate write access to the database.
260
261 :param exported_file: An `IExportedTranslationFile` containing the
262 exported data.
263 """
264 self.exported_file = exported_file
265
266 def upload(self, logger=None):
267 """Upload exported file as set with `setExportFile` to the Librarian.
268
269 If no file has been set, do nothing.
270 """
271 if self.exported_file is None:
272 # There's nothing to upload.
273 return
274
275 if self.exported_file.path is None:
276 # The exported path is unknown, use translation domain as its
277 # filename.
278 assert self.exported_file.file_extension, (
279 'File extension must have a value!.')
280 path = 'launchpad-export.%s' % self.exported_file.file_extension
281 else:
282 # Convert the path to a single file name so it's noted in
283 # librarian.
284 path = self.exported_file.path.replace(os.sep, '_')
285
286 alias_set = getUtility(ILibraryFileAliasSet)
287 alias = alias_set.create(
288 name=path, size=self.exported_file.size, file=self.exported_file,
289 contentType=self.exported_file.content_type)
290
291 self.url = alias.http_url
292 if logger is not None:
293 logger.info("Stored file at %s" % self.url)
294
253 def notify(self):295 def notify(self):
254 """Send a notification email to the given person about the export.296 """Send a notification email to the given person about the export.
255297
@@ -331,6 +373,9 @@
331 multiple files) and information about files that we failed to export (if373 multiple files) and information about files that we failed to export (if
332 any).374 any).
333 """375 """
376 # Keep as much work off the master store as possible, so we avoid
377 # opening unnecessary transactions there. It could be a while
378 # before we get to the commit.
334 translation_exporter = getUtility(ITranslationExporter)379 translation_exporter = getUtility(ITranslationExporter)
335 requested_objects = list(objects)380 requested_objects = list(objects)
336381
@@ -352,28 +397,9 @@
352 # error, we add the entry to the list of errors.397 # error, we add the entry to the list of errors.
353 result.addFailure()398 result.addFailure()
354 else:399 else:
355 if exported_file.path is None:400 result.setExportFile(exported_file)
356 # The exported path is unknown, use translation domain as its401
357 # filename.402 return result
358 assert exported_file.file_extension, (
359 'File extension must have a value!.')
360 exported_path = 'launchpad-export.%s' % (
361 exported_file.file_extension)
362 else:
363 # Convert the path to a single file name so it's noted in
364 # librarian.
365 exported_path = exported_file.path.replace(os.sep, '_')
366
367 alias_set = getUtility(ILibraryFileAliasSet)
368 alias = alias_set.create(
369 name=exported_path,
370 size=exported_file.size,
371 file=exported_file,
372 contentType=exported_file.content_type)
373 result.url = alias.http_url
374 logger.info("Stored file at %s" % result.url)
375
376 result.notify()
377403
378404
379def process_queue(transaction_manager, logger):405def process_queue(transaction_manager, logger):
@@ -386,8 +412,19 @@
386412
387 request = request_set.getRequest()413 request = request_set.getRequest()
388 while request != no_request:414 while request != no_request:
389 person, objects, format, request_ids = request415
390 process_request(person, objects, format, logger)416 # This can take a long time. Make sure we don't open any
417 # transactions on the master store before we really need to.
418 transaction_manager.commit()
419 with SlaveOnlyDatabasePolicy():
420 person, objects, format, request_ids = request
421 result = process_request(person, objects, format, logger)
422
423 # Almost done. Now we can go back to using the master database
424 # where needed.
425 result.upload(logger=logger)
426 result.notify()
427
391 request_set.removeRequest(request_ids)428 request_set.removeRequest(request_ids)
392 transaction_manager.commit()429 transaction_manager.commit()
393430
394431
=== added file 'lib/lp/translations/tests/test_exportresult.py'
--- lib/lp/translations/tests/test_exportresult.py 1970-01-01 00:00:00 +0000
+++ lib/lp/translations/tests/test_exportresult.py 2011-01-13 23:31:34 +0000
@@ -0,0 +1,67 @@
1# Copyright 2010 Canonical Ltd. This software is licensed under the
2# GNU Affero General Public License version 3 (see the file LICENSE).
3
4"""Tests for `ExportResult`."""
5
6__metaclass__ = type
7
8import hashlib
9import os.path
10from StringIO import StringIO
11
12from canonical.librarian.testing.fake import FakeLibrarian
13from canonical.testing.layers import ZopelessDatabaseLayer
14from lp.services.log.logger import DevNullLogger
15from lp.testing import TestCaseWithFactory
16from lp.translations.scripts.po_export_queue import ExportResult
17
18
19class FakeExportedTranslationFile:
20 """Fake `IExportedTranslationFile` for testing."""
21
22 def __init__(self, path, content, content_type):
23 self.path = path
24 base, ext = os.path.splitext(path)
25 self.file_extension = ext
26 self.size = len(content)
27 self.content = content
28 self.file = StringIO(content)
29 self.content_type = content_type
30
31 def read(self, *args, **kwargs):
32 return self.file.read(*args, **kwargs)
33
34
35class TestExportResult(TestCaseWithFactory):
36
37 layer = ZopelessDatabaseLayer
38
39 def makeExportResult(self):
40 request = [self.factory.makePOFile()]
41 requester = self.factory.makePerson()
42 logger = DevNullLogger()
43 return ExportResult(requester, request, logger)
44
45 def makeExportedTranslationFile(self):
46 filename = self.factory.getUniqueString()
47 content = self.factory.getUniqueString()
48 mime_type = "text/plain"
49 return FakeExportedTranslationFile(filename, content, mime_type)
50
51 def test_upload_exported_file(self):
52 librarian = self.useFixture(FakeLibrarian())
53 export = self.makeExportedTranslationFile()
54 export_result = self.makeExportResult()
55 export_result.setExportFile(export)
56 export_result.upload()
57
58 self.assertIsNot(None, export_result.url)
59 sha1 = hashlib.sha1(export.content).hexdigest()
60 self.assertEqual(sha1, librarian.aliases.values()[0].content.sha1)
61 alias = librarian.findBySHA1(sha1)
62 self.assertEqual(export.path, alias.filename)
63
64 def test_upload_without_exported_file_does_nothing(self):
65 export_result = self.makeExportResult()
66 export_result.upload()
67 self.assertIs(None, export_result.url)