Merge lp:~benji/launchpad/bug-742662-2 into lp:launchpad

Proposed by Benji York
Status: Merged
Approved by: Benji York
Approved revision: no longer in the source branch.
Merged at revision: 13957
Proposed branch: lp:~benji/launchpad/bug-742662-2
Merge into: lp:launchpad
Diff against target: 180 lines (+54/-30)
3 files modified
lib/lp/translations/utilities/sanitize.py (+49/-23)
lib/lp/translations/utilities/tests/test_file_importer.py (+3/-4)
lib/lp/translations/utilities/tests/test_sanitize.py (+2/-3)
To merge this branch: bzr merge lp:~benji/launchpad/bug-742662-2
Reviewer Review Type Date Requested Status
Steve Kowalik (community) code Approve
Review via email: mp+75248@code.launchpad.net

Commit message

[r=stevenk][bug=742662] remove the validation of grandfathered-in translation data with mixed newlines

Description of the change

Bug 742662 describes a scenario in which very old imported translations
can contain mixed newlines (\r as well as \n). The fix (discovered by
doing an intra-implementation discussion with Henning) is to ignore the
mixed newlines in already-imported translations while preserving the
current behavior of disallowing any new imports with mixed newlines.

Since there was already two import sanitization functions
(sanitize_translations_from_import and sanitize_translations_from_webui)
which were just references to one-another, it was easy to provide new
definitions that behave as desired.

The only semi-invasive thing I had to do was to rework the place at
which the MixedNewlineMarkersError was raised. That's what caused the
introduction of the new mixed_style constant in the Sanitizer class and
the new handling thereof.

The tests had to be slightly modified to account for the new behavior.

Tests: bin/test -c -m lp.translations.utilities.tests

Lint: several pre-existing lint errors were fixed

QA: I would say to use the reproduction instructions on bug 742662 on
qastaging, but I can't reproduce the error now, so I'll have to figure
something else out.

To post a comment you must log in.
Revision history for this message
Steve Kowalik (stevenk) wrote :

This looks good. My only comment would be to fix the indentation for the sanitize_translations_from_webui() function.

review: Approve (code)
Revision history for this message
Benji York (benji) wrote :

On Tue, Sep 13, 2011 at 10:44 PM, Steve Kowalik <email address hidden> wrote:
> Review: Approve code

Thanks for the review!

> This looks good. My only comment would be to fix the indentation for the
> sanitize_translations_from_webui() function.

I can't tell which bit of indentation you're referring to. It looks pretty
normal to me.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'lib/lp/translations/utilities/sanitize.py'
--- lib/lp/translations/utilities/sanitize.py 2011-09-07 11:34:06 +0000
+++ lib/lp/translations/utilities/sanitize.py 2011-09-13 19:59:35 +0000
@@ -24,6 +24,7 @@
24 windows_style = u'\r\n'24 windows_style = u'\r\n'
25 mac_style = u'\r'25 mac_style = u'\r'
26 unix_style = u'\n'26 unix_style = u'\n'
27 mixed_style = object()
2728
28 dot_char = u'\u2022'29 dot_char = u'\u2022'
2930
@@ -44,28 +45,25 @@
44 self.prefix = ''45 self.prefix = ''
45 self.postfix = ''46 self.postfix = ''
46 # Get the newline style that is used in the English Singular.47 # Get the newline style that is used in the English Singular.
47 self.newline_style = self._getNewlineStyle(48 self.newline_style = self._getNewlineStyle(english_singular)
48 english_singular, 'Original')
4949
50 def _getNewlineStyle(self, text, text_name):50 @classmethod
51 def _getNewlineStyle(cls, text):
51 """Find out which newline style is used in text."""52 """Find out which newline style is used in text."""
52 error_message = (
53 "%s text (%r) mixes different newline markers." % (
54 text_name, text))
55 style = None53 style = None
56 # To avoid confusing the single-character newline styles for mac and54 # To avoid confusing the single-character newline styles for mac and
57 # unix with the two-character windows one, remove the windows-style55 # unix with the two-character windows one, remove the windows-style
58 # newlines from the text and use that text to search for the other56 # newlines from the text and use that text to search for the other
59 # two.57 # two.
60 stripped_text = text.replace(self.windows_style, u'')58 stripped_text = text.replace(cls.windows_style, u'')
61 if text != stripped_text:59 if text != stripped_text:
62 # Text contains windows style new lines.60 # Text contains windows style new lines.
63 style = self.windows_style61 style = cls.windows_style
6462
65 for one_char_style in (self.mac_style, self.unix_style):63 for one_char_style in (cls.mac_style, cls.unix_style):
66 if one_char_style in stripped_text:64 if one_char_style in stripped_text:
67 if style is not None:65 if style is not None:
68 raise MixedNewlineMarkersError(error_message)66 return cls.mixed_style
69 style = one_char_style67 style = one_char_style
7068
71 return style69 return style
@@ -135,25 +133,41 @@
135133
136 def normalizeNewlines(self, translation_text):134 def normalizeNewlines(self, translation_text):
137 """Return 'translation_text' with newlines sync with english_singular.135 """Return 'translation_text' with newlines sync with english_singular.
136
137 Raises an exception if the text has mixed newline styles.
138 """138 """
139 if self.newline_style is None:139 if self.newline_style is None:
140 # No newlines in the English singular, so we have nothing to do.140 # No newlines in the English singular, so we have nothing to do.
141 return translation_text141 return translation_text
142142
143 # Get the style that uses the given text.143 # Get the style that is used in the given text.
144 translation_newline_style = self._getNewlineStyle(144 translation_newline_style = self._getNewlineStyle(translation_text)
145 translation_text, 'Translations')145
146 if translation_newline_style == self.mixed_style:
147 # The translation has mixed newlines in it; that is not allowed.
148 raise MixedNewlineMarkersError(
149 "Translations text (%r) mixes different newline markers." %
150 translation_text)
146151
147 if translation_newline_style is None:152 if translation_newline_style is None:
148 # We don't need to do anything, the text is not changed.153 # The translation text doesn't contain any newlines, so there is
154 # nothing for us to do.
149 return translation_text155 return translation_text
150156
151 # Fix the newline chars.157 if self.newline_style is self.mixed_style:
152 return translation_text.replace(158 # The original has mixed newlines (some very old data are like
153 translation_newline_style, self.newline_style)159 # this, new data with mixed newlines are rejected), so we're just
154160 # going to punt and normalize to unix style.
155161 return translation_text.replace(
156def sanitize_translations_from_webui(162 translation_newline_style, self.unix_style)
163 else:
164 # Otherwise the translation text should be normalized to use the
165 # same newline style as the original.
166 return translation_text.replace(
167 translation_newline_style, self.newline_style)
168
169
170def sanitize_translations(
157 english_singular, translations, pluralforms):171 english_singular, translations, pluralforms):
158 """Sanitize `translations` using sanitize_translation.172 """Sanitize `translations` using sanitize_translation.
159173
@@ -189,6 +203,18 @@
189203
190 return sanitized_translations204 return sanitized_translations
191205
192# There will be a different function for translation coming from imports but206
193# for now it is identical to the one used in browser code.207def sanitize_translations_from_import(
194sanitize_translations_from_import = sanitize_translations_from_webui208 english_singular, translations, pluralforms):
209 # At import time we want to ensure that the english_singular does not
210 # contain mixed newline styles.
211 if Sanitizer._getNewlineStyle(english_singular) is Sanitizer.mixed_style:
212 raise MixedNewlineMarkersError(
213 "Original text (%r) mixes different newline markers." %
214 english_singular)
215 return sanitize_translations(english_singular, translations, pluralforms)
216
217
218def sanitize_translations_from_webui(
219 english_singular, translations, pluralforms):
220 return sanitize_translations(english_singular, translations, pluralforms)
195221
=== modified file 'lib/lp/translations/utilities/tests/test_file_importer.py'
--- lib/lp/translations/utilities/tests/test_file_importer.py 2011-09-07 11:34:06 +0000
+++ lib/lp/translations/utilities/tests/test_file_importer.py 2011-09-13 19:59:35 +0000
@@ -246,7 +246,7 @@
246 message.addTranslation(0, u'')246 message.addTranslation(0, u'')
247247
248 potmsgset = self.factory.makePOTMsgSet(248 potmsgset = self.factory.makePOTMsgSet(
249 potemplate = importer.potemplate, sequence=50)249 potemplate=importer.potemplate, sequence=50)
250 translation = importer.storeTranslationsInDatabase(250 translation = importer.storeTranslationsInDatabase(
251 message, potmsgset)251 message, potmsgset)
252 # No TranslationMessage is created.252 # No TranslationMessage is created.
@@ -566,7 +566,7 @@
566 def test_not_raises_OutdatedTranslationError_on_upstream_uploads(self):566 def test_not_raises_OutdatedTranslationError_on_upstream_uploads(self):
567 queue_entry = self._make_queue_entry(True)567 queue_entry = self._make_queue_entry(True)
568 try:568 try:
569 importer = POFileImporter(queue_entry, GettextPOImporter(), None)569 POFileImporter(queue_entry, GettextPOImporter(), None)
570 except OutdatedTranslationError:570 except OutdatedTranslationError:
571 self.fail("OutdatedTranslationError raised.")571 self.fail("OutdatedTranslationError raised.")
572572
@@ -574,7 +574,7 @@
574 queue_entry = self._make_queue_entry(True)574 queue_entry = self._make_queue_entry(True)
575 pofile = queue_entry.pofile575 pofile = queue_entry.pofile
576 old_raw_header = pofile.header576 old_raw_header = pofile.header
577 importer = POFileImporter(queue_entry, GettextPOImporter(), None)577 POFileImporter(queue_entry, GettextPOImporter(), None)
578 self.assertEqual(old_raw_header, pofile.header)578 self.assertEqual(old_raw_header, pofile.header)
579579
580580
@@ -762,4 +762,3 @@
762 importer = POFileImporter(762 importer = POFileImporter(
763 entry, importers[TranslationFileFormat.PO], None)763 entry, importers[TranslationFileFormat.PO], None)
764 self.assertTrue(importer.is_upstream_import_on_sourcepackage)764 self.assertTrue(importer.is_upstream_import_on_sourcepackage)
765
766765
=== modified file 'lib/lp/translations/utilities/tests/test_sanitize.py'
--- lib/lp/translations/utilities/tests/test_sanitize.py 2011-09-07 11:34:06 +0000
+++ lib/lp/translations/utilities/tests/test_sanitize.py 2011-09-13 19:59:35 +0000
@@ -133,7 +133,7 @@
133 english_newline, translation_text, sanitized))133 english_newline, translation_text, sanitized))
134134
135 def test_normalizeNewlines_mixed_newlines_english(self):135 def test_normalizeNewlines_mixed_newlines_english(self):
136 # Mixed newlines in the English text will raise an exception.136 # Mixed newlines in the English text will not raise an exception.
137 english_template = u"Text with%smixed%snewlines."137 english_template = u"Text with%smixed%snewlines."
138 for english_newline_1 in self.newline_styles:138 for english_newline_1 in self.newline_styles:
139 other_newlines = self.newline_styles[:]139 other_newlines = self.newline_styles[:]
@@ -141,8 +141,7 @@
141 for english_newline_2 in other_newlines:141 for english_newline_2 in other_newlines:
142 english_text = english_template % (142 english_text = english_template % (
143 english_newline_1, english_newline_2)143 english_newline_1, english_newline_2)
144 self.assertRaises(144 Sanitizer(english_text)
145 MixedNewlineMarkersError, Sanitizer, english_text)
146145
147 def test_normalizeNewlines_mixed_newlines_translation(self):146 def test_normalizeNewlines_mixed_newlines_translation(self):
148 # Mixed newlines in the translation text will raise an exception.147 # Mixed newlines in the translation text will raise an exception.