Merge lp:~jtv/launchpad/bug-427278 into lp:launchpad

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: Graham Binns
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~jtv/launchpad/bug-427278
Merge into: lp:launchpad
Diff against target: 246 lines (+57/-15)
5 files modified
lib/lp/code/model/directbranchcommit.py (+17/-3)
lib/lp/code/tests/test_directbranchcommit.py (+1/-1)
lib/lp/translations/doc/translations-export-to-branch.txt (+11/-3)
lib/lp/translations/scripts/translations_to_branch.py (+13/-6)
lib/lp/translations/utilities/gettext_po_exporter.py (+15/-2)
To merge this branch: bzr merge lp:~jtv/launchpad/bug-427278
Reviewer Review Type Date Requested Status
Graham Binns (community) code Approve
Review via email: mp+15147@code.launchpad.net

Commit message

Avoid long idle-in-transaction periods in translations-to-branch export.

To post a comment you must log in.
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

= Bug 427278 =

Second attempt; previously accidentally proposed for merging into db-devel (the default) which polluted the diff with whatever skew there happened to be between devel and db-devel.

Translations exports to bzr branches can sometimes keep transactions open for a while. There's no particular reason to go that far, and it's Not Nice. We've had the script idle-killed a few times.

We've done several things that prevented a repetition:

1. We no longer export all translations for a template; now we only do ones that have changed since the last export. Makes the exports a lot faster.

2. The deadline for long transactions was extended, and obviously that's pretty effective at stopping the alarm bells from ringing. But ideal it is not.

3. The branch you see here makes the script commit more often around the likely pain points.

There's really only one catch: DirectBranchCommit checks the branch for conflicting changes. For that it needs to compare the latest revision in the branch to the last_scanned_id as stored in the database when the DirectBranchCommit was constructed. So instead of getting the last_scanned_id at the point where it's referenced, DirectBranchCommit will now retrieve and store it at construction time. This should also close off a race condition that may or may not exist in the case where database transactions are committed while a DirectBranchCommit is active.

Passing a transaction into DirectBranchCommit.commit, as I do here, is a bit weird. There are reasons for it:

 * Operations that access branch storage can take a long time, but don't need database access. That makes them excellent points to close off any ongoing transaction. The framework starts new transactions on demand, and that's going to happen only after the bzr operation completes. Until that time, the committer avoids being "idle in transaction" on the database.

 * Nowadays txn is global, so we could just "import transaction" and "transaction.commit()". But passing it as an object makes it easy to mock transactions in tests. Also, passing a boolean is slightly more error-prone. Named parameters are a big improvement--"dbc.commit(message, commit=True)"--but still relatively sensitive to mistakes.

 * Passing the transaction object to commit rather than the constructor makes it clear that you're not giving out a blanket license to commit. The DirectBranchCommit can commit only inside that one particular method call. Don't hide transaction boundaries in the program flow.

A lot of the actual diff went into nicer logging.

== Test, demo & Q/A ==

Test:
{{{
./bin/test -vv -t 'translations.*to.*branch' -t 'direct.*branch.*commit'
}}}

Create a branch on staging, and set it as the productseries branch and the translations export branch for some project. Enable translations and two-way bzr synchronization. Push a template to the branch.

Watch the project's import queue; wait for the template to be imported. Then translate a string in the template. Watch the revision log for the branch; an automatic translations commit should appear there within the hour.

No lint.

Jeroen

Revision history for this message
Graham Binns (gmb) :
review: Approve (code)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/code/model/directbranchcommit.py'
2--- lib/lp/code/model/directbranchcommit.py 2009-09-09 20:03:19 +0000
3+++ lib/lp/code/model/directbranchcommit.py 2009-11-26 21:25:26 +0000
4@@ -74,6 +74,8 @@
5 self.db_branch = db_branch
6 self.to_mirror = to_mirror
7
8+ self.last_scanned_id = self.db_branch.last_scanned_id
9+
10 if committer is None:
11 committer = db_branch.owner
12 self.committer = committer
13@@ -172,12 +174,18 @@
14 assert self.is_locked, "Getting revision on un-locked branch."
15 last_revision = None
16 last_revision = self.bzrbranch.last_revision()
17- if last_revision != self.db_branch.last_scanned_id:
18+ if last_revision != self.last_scanned_id:
19 raise ConcurrentUpdateError(
20 "Branch has been changed. Not committing.")
21
22- def commit(self, commit_message):
23- """Commit to branch."""
24+ def commit(self, commit_message, txn=None):
25+ """Commit to branch.
26+
27+ :param commit_message: Message for branch's commit log.
28+ :param txn: Transaction to commit. Can be helpful in avoiding
29+ long idle times in database transactions. May be committed
30+ more than once.
31+ """
32 assert self.is_open, "Committing closed DirectBranchCommit."
33 assert self.is_locked, "Not locked at commit time."
34
35@@ -185,6 +193,9 @@
36 try:
37 self._checkForRace()
38
39+ if txn:
40+ txn.commit()
41+
42 rev_id = self.revision_tree.get_revision_id()
43 if rev_id == NULL_REVISION:
44 if list(self.transform_preview.iter_changes()) == []:
45@@ -193,6 +204,9 @@
46 self.bzrbranch, commit_message)
47 IMasterObject(self.db_branch).requestMirror()
48
49+ if txn:
50+ txn.commit()
51+
52 finally:
53 self.unlock()
54 self.is_open = False
55
56=== modified file 'lib/lp/code/tests/test_directbranchcommit.py'
57--- lib/lp/code/tests/test_directbranchcommit.py 2009-10-02 09:48:30 +0000
58+++ lib/lp/code/tests/test_directbranchcommit.py 2009-11-26 21:25:26 +0000
59@@ -37,7 +37,7 @@
60
61 self.committer = DirectBranchCommit(self.db_branch)
62 if update_last_scanned_id:
63- self.db_branch.last_scanned_id = (
64+ self.committer.last_scanned_id = (
65 self.committer.bzrbranch.last_revision())
66
67 def _tearDownCommitter(self):
68
69=== modified file 'lib/lp/translations/doc/translations-export-to-branch.txt'
70--- lib/lp/translations/doc/translations-export-to-branch.txt 2009-10-20 06:03:46 +0000
71+++ lib/lp/translations/doc/translations-export-to-branch.txt 2009-11-26 21:25:26 +0000
72@@ -45,7 +45,7 @@
73 ... if self.simulate_race:
74 ... raise ConcurrentUpdateError("Simulated race condition.")
75 ...
76- ... def commit(self, message=None):
77+ ... def commit(self, message=None, txn=None):
78 ... self._checkForRace()
79 ... self.logger.info("Committed %d file(s)." % self.written_files)
80 ...
81@@ -129,11 +129,11 @@
82
83 >>> transaction.commit()
84 >>> script = MockExportTranslationsToBranch(
85- ... 'export-to-branch', test_args=['-vv'])
86+ ... 'export-to-branch', test_args=[])
87 >>> script.main()
88 INFO Exporting to translations branches.
89 INFO Exporting Gazblachko trunk series.
90- DEBUG No previous translations commit found.
91+ DEBUG ...
92 INFO Writing file 'po/main/nl.po':
93 INFO # ...
94 msgid ""
95@@ -143,6 +143,7 @@
96 msgid "maingazpot msgid"
97 msgstr "maingazpot msgstr"
98 <BLANKLINE>
99+ DEBUG ...
100 INFO Writing file 'po/module/nl.po':
101 INFO # ...
102 msgid ""
103@@ -152,6 +153,8 @@
104 <BLANKLINE>
105 msgid "gazmod msgid"
106 msgstr "gazmod msgstr"
107+ <BLANKLINE>
108+ DEBUG ...
109 INFO Committed 2 file(s).
110 INFO Unlock.
111 INFO Processed 1 item(s); 0 failure(s).
112@@ -186,6 +189,7 @@
113 >>> script.main()
114 INFO Exporting to translations branches.
115 INFO Exporting Gazblachko trunk series.
116+ DEBUG ....
117 DEBUG Last commit was at ....
118 INFO Unlock.
119 INFO Processed 1 item(s); 0 failure(s).
120@@ -197,6 +201,7 @@
121 >>> script.main()
122 INFO Exporting to translations branches.
123 INFO Exporting Gazblachko trunk series.
124+ DEBUG ....
125 DEBUG Last commit was at ...
126 INFO Writing file 'po/main/nl.po':
127 INFO ...
128@@ -228,11 +233,14 @@
129 >>> script.main()
130 INFO Exporting to translations branches.
131 INFO Exporting Gazblachko trunk series.
132+ DEBUG ....
133 DEBUG No previous translations commit found.
134+ DEBUG ....
135 INFO Writing file 'po/main/nl.po':
136 ...
137 msgstr "gazmod msgstr"
138 <BLANKLINE>
139+ DEBUG ...
140 INFO Unlock.
141 ERROR Failure: Simulated race condition.
142 INFO Processed 1 item(s); 1 failure(s).
143
144=== modified file 'lib/lp/translations/scripts/translations_to_branch.py'
145--- lib/lp/translations/scripts/translations_to_branch.py 2009-10-20 06:03:46 +0000
146+++ lib/lp/translations/scripts/translations_to_branch.py 2009-11-26 21:25:26 +0000
147@@ -119,7 +119,7 @@
148 def _commit(self, source, committer):
149 """Commit changes to branch. Check for race conditions."""
150 self._checkForObjections(source)
151- committer.commit(self.commit_message)
152+ committer.commit(self.commit_message, txn=self.txn)
153
154 def _isTranslationsCommit(self, revision):
155 """Is `revision` an automatic translations commit?"""
156@@ -159,6 +159,9 @@
157 self._checkForObjections(source)
158
159 committer = self._prepareBranchCommit(source.translations_branch)
160+ self.logger.debug("Created DirectBranchCommit.")
161+ if self.txn:
162+ self.txn.commit()
163
164 bzr_branch = committer.bzrbranch
165
166@@ -185,15 +188,17 @@
167 base_path = os.path.dirname(template.path)
168
169 for pofile in template.pofiles:
170-
171 has_changed = (
172 changed_since is None or
173 pofile.date_changed > changed_since)
174 if not has_changed:
175 continue
176
177+ language_code = pofile.getFullLanguageCode()
178+ self.logger.debug("Exporting %s." % language_code)
179+
180 pofile_path = os.path.join(
181- base_path, pofile.getFullLanguageCode() + '.po')
182+ base_path, language_code + '.po')
183 pofile_contents = pofile.export()
184
185 committer.writeFile(pofile_path, pofile_contents)
186@@ -210,6 +215,7 @@
187 template.clearPOFileCache()
188
189 if change_count > 0:
190+ self.logger.debug("Writing to branch.")
191 self._commit(source, committer)
192 finally:
193 committer.unlock()
194@@ -231,13 +237,14 @@
195 raise
196 except Exception, e:
197 items_failed += 1
198- self.logger.error("Failure: %s" % e)
199+ message = unicode(e)
200+ if message == u'':
201+ message = e.__class__.__name__
202+ self.logger.error("Failure: %s" % message)
203 if self.txn:
204 self.txn.abort()
205
206 items_done += 1
207- if self.txn:
208- self.txn.begin()
209
210 self.logger.info("Processed %d item(s); %d failure(s)." % (
211 items_done, items_failed))
212
213=== modified file 'lib/lp/translations/utilities/gettext_po_exporter.py'
214--- lib/lp/translations/utilities/gettext_po_exporter.py 2009-11-02 00:10:05 +0000
215+++ lib/lp/translations/utilities/gettext_po_exporter.py 2009-11-26 21:25:26 +0000
216@@ -14,6 +14,7 @@
217 'GettextPOExporter',
218 ]
219
220+import logging
221 import os
222
223 from zope.interface import implements
224@@ -344,8 +345,20 @@
225 raise UnicodeEncodeError(
226 '%s:\n%s' % (file_path, str(error)))
227
228- # This message cannot be represented in current encoding,
229- # change to UTF-8 and try again.
230+ # This message cannot be represented in the current
231+ # encoding.
232+ if translation_file.path:
233+ file_description = translation_file.path
234+ elif translation_file.language_code:
235+ file_description = (
236+ "%s translation" % translation_file.language_code)
237+ else:
238+ file_description = "template"
239+ logging.info(
240+ "Can't represent %s as %s; using UTF-8 instead." % (
241+ file_description,
242+ translation_file.header.charset.upper()))
243+
244 old_charset = translation_file.header.charset
245 translation_file.header.charset = 'UTF-8'
246 # We need to update the header too.