Merge lp:~benji/launchpad/bug-742662-2 into lp:launchpad

Proposed by Benji York
Status: Merged
Approved by: Benji York
Approved revision: no longer in the source branch.
Merged at revision: 13957
Proposed branch: lp:~benji/launchpad/bug-742662-2
Merge into: lp:launchpad
Diff against target: 180 lines (+54/-30)
3 files modified
lib/lp/translations/utilities/sanitize.py (+49/-23)
lib/lp/translations/utilities/tests/test_file_importer.py (+3/-4)
lib/lp/translations/utilities/tests/test_sanitize.py (+2/-3)
To merge this branch: bzr merge lp:~benji/launchpad/bug-742662-2
Reviewer Review Type Date Requested Status
Steve Kowalik (community) code Approve
Review via email: mp+75248@code.launchpad.net

Commit message

[r=stevenk][bug=742662] remove the validation of grandfathered-in translation data with mixed newlines

Description of the change

Bug 742662 describes a scenario in which very old imported translations
can contain mixed newlines (\r as well as \n). The fix (discovered by
doing an intra-implementation discussion with Henning) is to ignore the
mixed newlines in already-imported translations while preserving the
current behavior of disallowing any new imports with mixed newlines.

Since there was already two import sanitization functions
(sanitize_translations_from_import and sanitize_translations_from_webui)
which were just references to one-another, it was easy to provide new
definitions that behave as desired.

The only semi-invasive thing I had to do was to rework the place at
which the MixedNewlineMarkersError was raised. That's what caused the
introduction of the new mixed_style constant in the Sanitizer class and
the new handling thereof.

The tests had to be slightly modified to account for the new behavior.

Tests: bin/test -c -m lp.translations.utilities.tests

Lint: several pre-existing lint errors were fixed

QA: I would say to use the reproduction instructions on bug 742662 on
qastaging, but I can't reproduce the error now, so I'll have to figure
something else out.

To post a comment you must log in.
Revision history for this message
Steve Kowalik (stevenk) wrote :

This looks good. My only comment would be to fix the indentation for the sanitize_translations_from_webui() function.

review: Approve (code)
Revision history for this message
Benji York (benji) wrote :

On Tue, Sep 13, 2011 at 10:44 PM, Steve Kowalik <email address hidden> wrote:
> Review: Approve code

Thanks for the review!

> This looks good. My only comment would be to fix the indentation for the
> sanitize_translations_from_webui() function.

I can't tell which bit of indentation you're referring to. It looks pretty
normal to me.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/translations/utilities/sanitize.py'
2--- lib/lp/translations/utilities/sanitize.py 2011-09-07 11:34:06 +0000
3+++ lib/lp/translations/utilities/sanitize.py 2011-09-13 19:59:35 +0000
4@@ -24,6 +24,7 @@
5 windows_style = u'\r\n'
6 mac_style = u'\r'
7 unix_style = u'\n'
8+ mixed_style = object()
9
10 dot_char = u'\u2022'
11
12@@ -44,28 +45,25 @@
13 self.prefix = ''
14 self.postfix = ''
15 # Get the newline style that is used in the English Singular.
16- self.newline_style = self._getNewlineStyle(
17- english_singular, 'Original')
18+ self.newline_style = self._getNewlineStyle(english_singular)
19
20- def _getNewlineStyle(self, text, text_name):
21+ @classmethod
22+ def _getNewlineStyle(cls, text):
23 """Find out which newline style is used in text."""
24- error_message = (
25- "%s text (%r) mixes different newline markers." % (
26- text_name, text))
27 style = None
28 # To avoid confusing the single-character newline styles for mac and
29 # unix with the two-character windows one, remove the windows-style
30 # newlines from the text and use that text to search for the other
31 # two.
32- stripped_text = text.replace(self.windows_style, u'')
33+ stripped_text = text.replace(cls.windows_style, u'')
34 if text != stripped_text:
35 # Text contains windows style new lines.
36- style = self.windows_style
37+ style = cls.windows_style
38
39- for one_char_style in (self.mac_style, self.unix_style):
40+ for one_char_style in (cls.mac_style, cls.unix_style):
41 if one_char_style in stripped_text:
42 if style is not None:
43- raise MixedNewlineMarkersError(error_message)
44+ return cls.mixed_style
45 style = one_char_style
46
47 return style
48@@ -135,25 +133,41 @@
49
50 def normalizeNewlines(self, translation_text):
51 """Return 'translation_text' with newlines sync with english_singular.
52+
53+ Raises an exception if the text has mixed newline styles.
54 """
55 if self.newline_style is None:
56 # No newlines in the English singular, so we have nothing to do.
57 return translation_text
58
59- # Get the style that uses the given text.
60- translation_newline_style = self._getNewlineStyle(
61- translation_text, 'Translations')
62+ # Get the style that is used in the given text.
63+ translation_newline_style = self._getNewlineStyle(translation_text)
64+
65+ if translation_newline_style == self.mixed_style:
66+ # The translation has mixed newlines in it; that is not allowed.
67+ raise MixedNewlineMarkersError(
68+ "Translations text (%r) mixes different newline markers." %
69+ translation_text)
70
71 if translation_newline_style is None:
72- # We don't need to do anything, the text is not changed.
73+ # The translation text doesn't contain any newlines, so there is
74+ # nothing for us to do.
75 return translation_text
76
77- # Fix the newline chars.
78- return translation_text.replace(
79- translation_newline_style, self.newline_style)
80-
81-
82-def sanitize_translations_from_webui(
83+ if self.newline_style is self.mixed_style:
84+ # The original has mixed newlines (some very old data are like
85+ # this, new data with mixed newlines are rejected), so we're just
86+ # going to punt and normalize to unix style.
87+ return translation_text.replace(
88+ translation_newline_style, self.unix_style)
89+ else:
90+ # Otherwise the translation text should be normalized to use the
91+ # same newline style as the original.
92+ return translation_text.replace(
93+ translation_newline_style, self.newline_style)
94+
95+
96+def sanitize_translations(
97 english_singular, translations, pluralforms):
98 """Sanitize `translations` using sanitize_translation.
99
100@@ -189,6 +203,18 @@
101
102 return sanitized_translations
103
104-# There will be a different function for translation coming from imports but
105-# for now it is identical to the one used in browser code.
106-sanitize_translations_from_import = sanitize_translations_from_webui
107+
108+def sanitize_translations_from_import(
109+ english_singular, translations, pluralforms):
110+ # At import time we want to ensure that the english_singular does not
111+ # contain mixed newline styles.
112+ if Sanitizer._getNewlineStyle(english_singular) is Sanitizer.mixed_style:
113+ raise MixedNewlineMarkersError(
114+ "Original text (%r) mixes different newline markers." %
115+ english_singular)
116+ return sanitize_translations(english_singular, translations, pluralforms)
117+
118+
119+def sanitize_translations_from_webui(
120+ english_singular, translations, pluralforms):
121+ return sanitize_translations(english_singular, translations, pluralforms)
122
123=== modified file 'lib/lp/translations/utilities/tests/test_file_importer.py'
124--- lib/lp/translations/utilities/tests/test_file_importer.py 2011-09-07 11:34:06 +0000
125+++ lib/lp/translations/utilities/tests/test_file_importer.py 2011-09-13 19:59:35 +0000
126@@ -246,7 +246,7 @@
127 message.addTranslation(0, u'')
128
129 potmsgset = self.factory.makePOTMsgSet(
130- potemplate = importer.potemplate, sequence=50)
131+ potemplate=importer.potemplate, sequence=50)
132 translation = importer.storeTranslationsInDatabase(
133 message, potmsgset)
134 # No TranslationMessage is created.
135@@ -566,7 +566,7 @@
136 def test_not_raises_OutdatedTranslationError_on_upstream_uploads(self):
137 queue_entry = self._make_queue_entry(True)
138 try:
139- importer = POFileImporter(queue_entry, GettextPOImporter(), None)
140+ POFileImporter(queue_entry, GettextPOImporter(), None)
141 except OutdatedTranslationError:
142 self.fail("OutdatedTranslationError raised.")
143
144@@ -574,7 +574,7 @@
145 queue_entry = self._make_queue_entry(True)
146 pofile = queue_entry.pofile
147 old_raw_header = pofile.header
148- importer = POFileImporter(queue_entry, GettextPOImporter(), None)
149+ POFileImporter(queue_entry, GettextPOImporter(), None)
150 self.assertEqual(old_raw_header, pofile.header)
151
152
153@@ -762,4 +762,3 @@
154 importer = POFileImporter(
155 entry, importers[TranslationFileFormat.PO], None)
156 self.assertTrue(importer.is_upstream_import_on_sourcepackage)
157-
158
159=== modified file 'lib/lp/translations/utilities/tests/test_sanitize.py'
160--- lib/lp/translations/utilities/tests/test_sanitize.py 2011-09-07 11:34:06 +0000
161+++ lib/lp/translations/utilities/tests/test_sanitize.py 2011-09-13 19:59:35 +0000
162@@ -133,7 +133,7 @@
163 english_newline, translation_text, sanitized))
164
165 def test_normalizeNewlines_mixed_newlines_english(self):
166- # Mixed newlines in the English text will raise an exception.
167+ # Mixed newlines in the English text will not raise an exception.
168 english_template = u"Text with%smixed%snewlines."
169 for english_newline_1 in self.newline_styles:
170 other_newlines = self.newline_styles[:]
171@@ -141,8 +141,7 @@
172 for english_newline_2 in other_newlines:
173 english_text = english_template % (
174 english_newline_1, english_newline_2)
175- self.assertRaises(
176- MixedNewlineMarkersError, Sanitizer, english_text)
177+ Sanitizer(english_text)
178
179 def test_normalizeNewlines_mixed_newlines_translation(self):
180 # Mixed newlines in the translation text will raise an exception.