Merge lp:~vila/bzr/799212-non-ascii-confs into lp:bzr

Proposed by Vincent Ladeuil
Status: Merged
Approved by: Jelmer Vernooij
Approved revision: no longer in the source branch.
Merged at revision: 5989
Proposed branch: lp:~vila/bzr/799212-non-ascii-confs
Merge into: lp:bzr
Diff against target: 261 lines (+131/-16)
4 files modified
bzrlib/config.py (+21/-11)
bzrlib/errors.py (+9/-0)
bzrlib/tests/test_config.py (+97/-5)
doc/en/release-notes/bzr-2.4.txt (+4/-0)
To merge this branch: bzr merge lp:~vila/bzr/799212-non-ascii-confs
Reviewer Review Type Date Requested Status
Jelmer Vernooij (community) Approve
Review via email: mp+65199@code.launchpad.net

Commit message

Clearer exceptions for bad-content config files

Description of the change

This complements the previous submission by making sure all
config files errors related to a non-utf8 or not parseable
content properly mention the appropriate path/url.

The least obvious part is that TransportConfig() objects have
been involved in some bugs and since it uses ConfigObj directly,
it escaped the exception cathcing that was happening in
IniBasedConfig.

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) :
review: Approve
Revision history for this message
Vincent Ladeuil (vila) wrote :

sent to pqm by email

Revision history for this message
Martin Packman (gz) wrote :

This set of changes is what I had in mind.

As I understand it, bug 502060 is about TransportConfig lacking
try/except, then bug 688677 and bug 797246 are the same non-utf8 issue
in copied code. However I don't understand where bug 799212 comes in
or even what it is really from the report.

One change I'd suggest is making ConfigContentError take the
UnicodeError instance as a second param so the UI has the option of
giving more details like where in the file to look for the bad bytes.

Revision history for this message
Vincent Ladeuil (vila) wrote :

> This set of changes is what I had in mind.

Good.

>
> As I understand it, bug 502060 is about TransportConfig lacking
> try/except,

Yes.

> then bug 688677 and bug 797246 are the same non-utf8 issue
> in copied code.

Not exactly but close enough.

> However I don't understand where bug 799212 comes in
> or even what it is really from the report.

Bah, that's just lp being helpful because I reused the same branch (see lp:bzr, a previous fix has already landed).

>
> One change I'd suggest is making ConfigContentError take the
> UnicodeError instance as a second param so the UI has the option of
> giving more details like where in the file to look for the bad bytes.

I thought about that even tried to point to the right line number.

But whatever trick you may want to do at this point you have to inspect the content of the file again. So you may as well split its content and try to decode each line (which I thought was a bit too much) so adding the UnicodeError doesn't add much (and I didn't want to dive into the various possible errors).

We may revisit that if we get new bug reports from people not getting the clue about the encoding.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/config.py'
2--- bzrlib/config.py 2011-06-20 09:31:17 +0000
3+++ bzrlib/config.py 2011-06-20 14:07:30 +0000
4@@ -692,6 +692,8 @@
5 self._parser = ConfigObj(co_input, encoding='utf-8')
6 except configobj.ConfigObjError, e:
7 raise errors.ParseConfigError(e.errors, e.config.filename)
8+ except UnicodeDecodeError:
9+ raise errors.ConfigContentError(self.file_name)
10 # Make sure self.reload() will use the right file name
11 self._parser.filename = self.file_name
12 for hook in OldConfigHooks['load']:
13@@ -1697,6 +1699,8 @@
14 self._config = ConfigObj(self._input, encoding='utf-8')
15 except configobj.ConfigObjError, e:
16 raise errors.ParseConfigError(e.errors, e.config.filename)
17+ except UnicodeError:
18+ raise errors.ConfigContentError(self._filename)
19 return self._config
20
21 def _save(self):
22@@ -2178,12 +2182,21 @@
23 except errors.NoSuchFile:
24 return StringIO()
25
26+ def _external_url(self):
27+ return urlutils.join(self._transport.external_url(), self._filename)
28+
29 def _get_configobj(self):
30 f = self._get_config_file()
31 try:
32- return ConfigObj(f, encoding='utf-8')
33+ try:
34+ conf = ConfigObj(f, encoding='utf-8')
35+ except configobj.ConfigObjError, e:
36+ raise errors.ParseConfigError(e.errors, self._external_url())
37+ except UnicodeDecodeError:
38+ raise errors.ConfigContentError(self._external_url())
39 finally:
40 f.close()
41+ return conf
42
43 def _set_configobj(self, configobj):
44 out_file = StringIO()
45@@ -2285,14 +2298,10 @@
46 """Loads the Store from persistent storage."""
47 raise NotImplementedError(self.load)
48
49- def _load_from_string(self, str_or_unicode):
50+ def _load_from_string(self, bytes):
51 """Create a store from a string in configobj syntax.
52
53- :param str_or_unicode: A string representing the file content. This will
54- be encoded to suit store needs internally.
55-
56- This is for tests and should not be used in production unless a
57- convincing use case can be demonstrated :)
58+ :param bytes: A string representing the file content.
59 """
60 raise NotImplementedError(self._load_from_string)
61
62@@ -2370,21 +2379,22 @@
63 for hook in ConfigHooks['load']:
64 hook(self)
65
66- def _load_from_string(self, str_or_unicode):
67+ def _load_from_string(self, bytes):
68 """Create a config store from a string.
69
70- :param str_or_unicode: A string representing the file content. This will
71- be utf-8 encoded internally.
72+ :param bytes: A string representing the file content.
73 """
74 if self.is_loaded():
75 raise AssertionError('Already loaded: %r' % (self._config_obj,))
76- co_input = StringIO(str_or_unicode)
77+ co_input = StringIO(bytes)
78 try:
79 # The config files are always stored utf8-encoded
80 self._config_obj = ConfigObj(co_input, encoding='utf-8')
81 except configobj.ConfigObjError, e:
82 self._config_obj = None
83 raise errors.ParseConfigError(e.errors, self.external_url())
84+ except UnicodeDecodeError:
85+ raise errors.ConfigContentError(self.external_url())
86
87 def save(self):
88 if not self.is_loaded():
89
90=== modified file 'bzrlib/errors.py'
91--- bzrlib/errors.py 2011-06-02 16:57:09 +0000
92+++ bzrlib/errors.py 2011-06-20 14:07:30 +0000
93@@ -1770,6 +1770,15 @@
94 _fmt = "Working tree has conflicts."
95
96
97+class ConfigContentError(BzrError):
98+
99+ _fmt = "Config file %(filename)s is not UTF-8 encoded\n"
100+
101+ def __init__(self, filename):
102+ BzrError.__init__(self)
103+ self.filename = filename
104+
105+
106 class ParseConfigError(BzrError):
107
108 _fmt = "Error(s) parsing config file %(filename)s:\n%(errors)s"
109
110=== modified file 'bzrlib/tests/test_config.py'
111--- bzrlib/tests/test_config.py 2011-06-20 09:31:17 +0000
112+++ bzrlib/tests/test_config.py 2011-06-20 14:07:30 +0000
113@@ -1871,9 +1871,34 @@
114
115 class TestTransportConfig(tests.TestCaseWithTransport):
116
117+ def test_load_utf8(self):
118+ """Ensure we can load an utf8-encoded file."""
119+ t = self.get_transport()
120+ unicode_user = u'b\N{Euro Sign}ar'
121+ unicode_content = u'user=%s' % (unicode_user,)
122+ utf8_content = unicode_content.encode('utf8')
123+ # Store the raw content in the config file
124+ t.put_bytes('foo.conf', utf8_content)
125+ conf = config.TransportConfig(t, 'foo.conf')
126+ self.assertEquals(unicode_user, conf.get_option('user'))
127+
128+ def test_load_non_ascii(self):
129+ """Ensure we display a proper error on non-ascii, non utf-8 content."""
130+ t = self.get_transport()
131+ t.put_bytes('foo.conf', 'user=foo\n#\xff\n')
132+ conf = config.TransportConfig(t, 'foo.conf')
133+ self.assertRaises(errors.ConfigContentError, conf._get_configobj)
134+
135+ def test_load_erroneous_content(self):
136+ """Ensure we display a proper error on content that can't be parsed."""
137+ t = self.get_transport()
138+ t.put_bytes('foo.conf', '[open_section\n')
139+ conf = config.TransportConfig(t, 'foo.conf')
140+ self.assertRaises(errors.ParseConfigError, conf._get_configobj)
141+
142 def test_get_value(self):
143 """Test that retreiving a value from a section is possible"""
144- bzrdir_config = config.TransportConfig(transport.get_transport('.'),
145+ bzrdir_config = config.TransportConfig(self.get_transport('.'),
146 'control.conf')
147 bzrdir_config.set_option('value', 'key', 'SECTION')
148 bzrdir_config.set_option('value2', 'key2')
149@@ -2328,12 +2353,22 @@
150 self.assertRaises(AssertionError, store._load_from_string, 'bar=baz')
151
152
153-class TestBug799212(TestStore):
154-
155- def test_load_utf8(self):
156+class TestIniFileStoreContent(tests.TestCaseWithTransport):
157+ """Simulate loading a config store without content of various encodings.
158+
159+ All files produced by bzr are in utf8 content.
160+
161+ Users may modify them manually and end up with a file that can't be
162+ loaded. We need to issue proper error messages in this case.
163+ """
164+
165+ invalid_utf8_char = '\xff'
166+
167+ def test_load_utf8(self):
168+ """Ensure we can load an utf8-encoded file."""
169 t = self.get_transport()
170 # From http://pad.lv/799212
171- unicode_user = u'Piotr O\u017carowski'
172+ unicode_user = u'b\N{Euro Sign}ar'
173 unicode_content = u'user=%s' % (unicode_user,)
174 utf8_content = unicode_content.encode('utf8')
175 # Store the raw content in the config file
176@@ -2343,6 +2378,58 @@
177 stack = config.Stack([store.get_sections], store)
178 self.assertEquals(unicode_user, stack.get('user'))
179
180+ def test_load_non_ascii(self):
181+ """Ensure we display a proper error on non-ascii, non utf-8 content."""
182+ t = self.get_transport()
183+ t.put_bytes('foo.conf', 'user=foo\n#%s\n' % (self.invalid_utf8_char,))
184+ store = config.IniFileStore(t, 'foo.conf')
185+ self.assertRaises(errors.ConfigContentError, store.load)
186+
187+ def test_load_erroneous_content(self):
188+ """Ensure we display a proper error on content that can't be parsed."""
189+ t = self.get_transport()
190+ t.put_bytes('foo.conf', '[open_section\n')
191+ store = config.IniFileStore(t, 'foo.conf')
192+ self.assertRaises(errors.ParseConfigError, store.load)
193+
194+
195+class TestIniConfigContent(tests.TestCaseWithTransport):
196+ """Simulate loading a IniBasedConfig without content of various encodings.
197+
198+ All files produced by bzr are in utf8 content.
199+
200+ Users may modify them manually and end up with a file that can't be
201+ loaded. We need to issue proper error messages in this case.
202+ """
203+
204+ invalid_utf8_char = '\xff'
205+
206+ def test_load_utf8(self):
207+ """Ensure we can load an utf8-encoded file."""
208+ # From http://pad.lv/799212
209+ unicode_user = u'b\N{Euro Sign}ar'
210+ unicode_content = u'user=%s' % (unicode_user,)
211+ utf8_content = unicode_content.encode('utf8')
212+ # Store the raw content in the config file
213+ with open('foo.conf', 'wb') as f:
214+ f.write(utf8_content)
215+ conf = config.IniBasedConfig(file_name='foo.conf')
216+ self.assertEquals(unicode_user, conf.get_user_option('user'))
217+
218+ def test_load_badly_encoded_content(self):
219+ """Ensure we display a proper error on non-ascii, non utf-8 content."""
220+ with open('foo.conf', 'wb') as f:
221+ f.write('user=foo\n#%s\n' % (self.invalid_utf8_char,))
222+ conf = config.IniBasedConfig(file_name='foo.conf')
223+ self.assertRaises(errors.ConfigContentError, conf._get_parser)
224+
225+ def test_load_erroneous_content(self):
226+ """Ensure we display a proper error on content that can't be parsed."""
227+ with open('foo.conf', 'wb') as f:
228+ f.write('[open_section\n')
229+ conf = config.IniBasedConfig(file_name='foo.conf')
230+ self.assertRaises(errors.ParseConfigError, conf._get_parser)
231+
232
233 class TestMutableStore(TestStore):
234
235@@ -3052,6 +3139,11 @@
236 self.assertEquals({}, conf._get_config())
237 self._got_user_passwd(None, None, conf, 'http', 'foo.net')
238
239+ def test_non_utf8_config(self):
240+ conf = config.AuthenticationConfig(_file=StringIO(
241+ 'foo = bar\xff'))
242+ self.assertRaises(errors.ConfigContentError, conf._get_config)
243+
244 def test_missing_auth_section_header(self):
245 conf = config.AuthenticationConfig(_file=StringIO('foo = bar'))
246 self.assertRaises(ValueError, conf.get_credentials, 'ftp', 'foo.net')
247
248=== modified file 'doc/en/release-notes/bzr-2.4.txt'
249--- doc/en/release-notes/bzr-2.4.txt 2011-06-20 09:31:17 +0000
250+++ doc/en/release-notes/bzr-2.4.txt 2011-06-20 14:07:30 +0000
251@@ -29,6 +29,10 @@
252 Bug Fixes
253 *********
254
255+* Display a proper error message when a config file content cannot be
256+ decoded as UTF-8 or when it cannot be parsed.
257+ (Vincent Ladeuil, #502060, #688677, #792246)
258+
259 * Properly load utf8-encoded config files. (Vincent Ladeuil, #799212)
260
261 .. Fixes for situations where bzr would previously crash or give incorrect