Merge lp:~gz/qbzr/reliable_exception_encoding_686735 into lp:qbzr
- reliable_exception_encoding_686735
- Merge into trunk2a
Status: | Merged |
---|---|
Approved by: | Alexander Belchenko |
Approved revision: | 1272 |
Merged at revision: | 1363 |
Proposed branch: | lp:~gz/qbzr/reliable_exception_encoding_686735 |
Merge into: | lp:qbzr |
Diff against target: |
155 lines (+91/-12) 2 files modified
lib/subprocess.py (+46/-12) lib/tests/test_subprocess.py (+45/-0) |
To merge this branch: | bzr merge lp:~gz/qbzr/reliable_exception_encoding_686735 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Alexander Belchenko | Approve | ||
Review via email: mp+57973@code.launchpad.net |
Commit message
Description of the change
Posting this evening's code for some feedback. The idea is to make it serialising arbitrary exception instances safer so the user sees something about the actual problem rather than a knock-on failure when weird errors are thrown.
Took a slightly different approach to Alexander's patch. Exception bencoding is now in one function, which:
* Uses the exception name and __dict__ as before
* Like Alexander's patch, switches to using repr on objects not derived from basestring
* Decodes bytestrings as ascii replacing bad characters
* Finally encodes all values as UTF-8 (slightly more compact than unicode-escape)
There are a few tests added for the problem cases I was thinking of, are there any others in need of covering?
Storing both args and the named attributes is slightly annoying as they tend to overlap, but I guess is simple and can actually be useful.
Alexander Belchenko (bialix) wrote : | # |
Alexander Belchenko (bialix) wrote : | # |
Martin, I'm not quite comfortable with bare `except:`, although I see you explicitly catch and re-raise others exceptions. But why do you use bare except here? What kind of errors we might get there? IIUC we shouldn't get UnicodeError, because we try hard to avoid it. Maybe something related to error in repr? Can you explain that in the comments to that except, please?
Otherwise, I'm happy to merge it.
Martin Packman (gz) wrote : | # |
> Martin, I'm not quite comfortable with bare `except:`, although I see you
> explicitly catch and re-raise others exceptions. But why do you use bare
> except here? What kind of errors we might get there? IIUC we shouldn't get
> UnicodeError, because we try hard to avoid it. Maybe something related to
> error in repr? Can you explain that in the comments to that except, please?
So, what I'm not sure about is whether we need to be careful about KeyboardInterrupt under Qt, the rest of the file seemed to worry about it.
The reasoning of the try/except formulation is as follows:
* repr invokes __repr__ which can run arbitrary code, so can raise any exception at all
* Just catching Exception traps KeyboardInterrupt on Python 2.4
* Once we're catching the classes moved to BaseException in 2.5 first, may as well use the bare except which happens to cover string exceptions on versions they're still legal
If KeyboardInterrupt should be being reraised, I should probably add that to the try/except around commands.run_bzr too, while Python 2.3 is still supported.
Is this change based on the right revision, or should I move it to be daggy if you want to land it as a bug fix?
Alexander Belchenko (bialix) wrote : | # |
Martin [gz] пишет:
>> Martin, I'm not quite comfortable with bare `except:`, although I see you
>> explicitly catch and re-raise others exceptions. But why do you use bare
>> except here? What kind of errors we might get there? IIUC we shouldn't get
>> UnicodeError, because we try hard to avoid it. Maybe something related to
>> error in repr? Can you explain that in the comments to that except, please?
>
> So, what I'm not sure about is whether we need to be careful about KeyboardInterrupt under Qt, the rest of the file seemed to worry about it.
>
> The reasoning of the try/except formulation is as follows:
> * repr invokes __repr__ which can run arbitrary code, so can raise any exception at all
> * Just catching Exception traps KeyboardInterrupt on Python 2.4
> * Once we're catching the classes moved to BaseException in 2.5 first, may as well use the bare except which happens to cover string exceptions on versions they're still legal
OK, that's enough for now. I'd like to see this explanation added as
comment in bare except block.
> If KeyboardInterrupt should be being reraised, I should probably add that to the try/except around commands.run_bzr too, while Python 2.3 is still supported.
>
> Is this change based on the right revision, or should I move it to be daggy if you want to land it as a bug fix?
I'm going to cherrypick it to backport the fix to lp:qbzr/0.20.
--
All the dude wanted was his rug back
Martin Packman (gz) wrote : | # |
> OK, that's enough for now. I'd like to see this explanation added as
> comment in bare except block.
Done.
> > If KeyboardInterrupt should be being reraised, I should probably add that to
> the try/except around commands.run_bzr too, while Python 2.3 is still
> supported.
> >
> > Is this change based on the right revision, or should I move it to be daggy
> if you want to land it as a bug fix?
>
> I'm going to cherrypick it to backport the fix to lp:qbzr/0.20.
I've moved the change back in to r1267 as well which looks like it's merge cleanly to any branches to want it on, and fiddled with a couple of other details. If there is anything else you'd like me to do, say. What's the qbzr NEWS policy? It might be easier for you to add it when merging if this is going to multiple places.
Alexander Belchenko (bialix) wrote : | # |
Thank you very much. I update NEWS.
Alexander Belchenko (bialix) wrote : | # |
I'm sorry, but I've just ran the tests and all your three new tests has failed. Can you look into it?
bzr selftest -s bp.qbzr
bzr selftest: C:\Program Files\Bazaar\
C:\Program Files\Bazaar\
bzr-2.3.1 python-2.6.6 Windows-
FAIL: bzrlib.
Text attachment: log
------------
------------
Text attachment: traceback
------------
Traceback (most recent call last):
File "testtools\
File "testtools\
File "C:\work\
File "C:\work\
AssertionError: not equal:
a = ('OSError', {})
b = ('OSError',
{'args': "(13, 'Lupa ev\\xc3\\xa4tty')",
'errno': '13',
'filename': 'None',
'strerror': u'Lupa ev\ufffd\
------------
FAIL: bzrlib.
Text attachment: log
------------
------------
Text attachment: traceback
------------
Traceback (most recent call last):
File "testtools\
File "testtools\
File "C:\work\
File "C:\work\
AssertionError: not equal:
a = ('ValueError', {})
b = ('ValueError', {'args': "('Simple',)"})
------------
FAIL: bzrlib.
Text attachment: log
------------
------------
Text attachment: traceback
------------
Traceback (most recent call last):
File "testtools\
File "testtools\
File "C:\work\
File "C:\work\
AssertionError: not equal:
a = ('ValueError', {})
b = ('ValueError', {'args': '[QBzr could not serialize this attribute]'})
------------
=======
FAIL: bzrlib.
-------
_StringException: Text attachment: log
------------
------------
Text attachment: traceback
------------
Traceback (most recent call last):
File "testtools\
File "testtools\
File "C:\work\
File "C:\work\
AssertionError: not equal:
a = ('OSError', {})
b = ('OSError',
{'args': "(13, 'Lupa ev\\xc3\\xa4tty...
Alexander Belchenko (bialix) wrote : | # |
This is with standalone bzr.exe 2.3.1
Alexander Belchenko (bialix) wrote : | # |
Alexander Belchenko пишет:
> This is with standalone bzr.exe 2.3.1
C:\Python26>
Python 2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit
(Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> e = ValueError(
>>> e.__dict__
{}
>>> e = OSError(13, "Foo bar")
>>> e.__dict__
{}
>>>
OK, so our code to pass the exception attributes never works as we
think. But I don't understand now why we have that bug.
/me confused
--
All the dude wanted was his rug back
Alexander Belchenko (bialix) wrote : | # |
It seems we should use dir(e) to get the list of the keys, instead of e.__dict__, because in Python 2.5 and 2.6 the latter is always empty dict.
Alexander Belchenko (bialix) wrote : | # |
e.__dict__ works only in Python 2.4 as I can see.
Martin Packman (gz) wrote : | # |
> I'm sorry, but I've just ran the tests and all your three new tests has
> failed. Can you look into it?
Yeay for unittesting catching this before it was merged. Note the fourth test I added last did not fail†. The problem got past me because I assumed the previous attribute selection logic was correct and preserved it, and naturally 2.4 was the version I tested on.
Annoyingly, just switching to `dir(e)` isn't enough, as that includes all attributes on the class and any parent classes which is what I think the original formulation was trying to avoid including.
I'll see if I can find out what's going on and if there's a neater way of doing what qbzr wants.
† Seems that assigning to self does get the attribute in __dict__ which is what BzrError classes tend to do, which is probably why this wasn't noticed before. Getting the attributes of standard Python errors is useful so we do want this code to work for them too.
Alexander Belchenko (bialix) wrote : | # |
Martin [gz] пишет:
>> I'm sorry, but I've just ran the tests and all your three new tests has
>> failed. Can you look into it?
>
> Yeay for unittesting catching this before it was merged. Note the fourth test I added last did not fail†. The problem got past me because I assumed the previous attribute selection logic was correct and preserved it, and naturally 2.4 was the version I tested on.
>
> Annoyingly, just switching to `dir(e)` isn't enough, as that includes all attributes on the class and any parent classes which is what I think the original formulation was trying to avoid including.
I'm not sure what do you mean by "all attributes on the class". At least
for OSError/ValueError we may want to filter out message or maybe args
attribute, I don't know. What BzrError attributes is really importnat
for us?
Looking through the code I see only one place where the errors internals
are used:
inside on_failed method:
if error==
So it seems Gary only wanted to show the display_url for
UncommittedChanges error.
So, if we serialize everything and later use only what we need, then it
does not worth to worry about filtering dir(e), does it?
>
> I'll see if I can find out what's going on and if there's a neater way of doing what qbzr wants.
I'm not quite sure what attributes we want to serialize, maybe Gary can
shed some light on that?
> † Seems that assigning to self does get the attribute in __dict__ which is what BzrError classes tend to do, which is probably why this wasn't noticed before. Getting the attributes of standard Python errors is useful so we do want this code to work for them too.
--
All the dude wanted was his rug back
Alexander Belchenko (bialix) wrote : | # |
So, qbzr is really care only about BzrErrors (today only UncommittdeChanges error used), not about generic errors. So you can rework your tests for ValueError and OSError to reflect this. And please add test for serializing UncommittedChanges error, so we can see it works as expected.
As for generic errors, I'd better serialize them as error class name plus stringified exception (like str(e) or unicode(e) but I'm not sure the latter will work for all errors), and send that to GUI process as `(error_name, {'str': str(e)})`. When we will decide that we need to process something like OSError we will change the serializer accordingly.
Gary van der Merwe (garyvdm) wrote : | # |
On 16/04/2011 16:48, Martin [gz] wrote:
>> Martin, I'm not quite comfortable with bare `except:`, although I
>> see you explicitly catch and re-raise others exceptions. But why do
>> you use bare except here? What kind of errors we might get there?
>> IIUC we shouldn't get UnicodeError, because we try hard to avoid
>> it. Maybe something related to error in repr? Can you explain that
>> in the comments to that except, please?
>
> So, what I'm not sure about is whether we need to be careful about
> KeyboardInterrupt under Qt, the rest of the file seemed to worry
> about it.
>
> The reasoning of the try/except formulation is as follows: * repr
> invokes __repr__ which can run arbitrary code, so can raise any
> exception at all * Just catching Exception traps KeyboardInterrupt on
> Python 2.4 * Once we're catching the classes moved to BaseException
> in 2.5 first, may as well use the bare except which happens to cover
> string exceptions on versions they're still legal
>
> If KeyboardInterrupt should be being reraised, I should probably add
> that to the try/except around commands.run_bzr too, while Python 2.3
> is still supported.
>
> Is this change based on the right revision, or should I move it to be
> daggy if you want to land it as a bug fix?
Note that a KeyboardInterrupt is very likely to happen in the
subprocess. On *nix, when you press cancel, the main process will send a
interrupt signal to the subprocess, which will cause a KeyboardInterrupt
to be raised. (On Windows we do something different.)
I think that it is best to catch all exceptions here. Essentially this
becomes the top level exception handler, and KeyboardInterrupts should
be reported to the main process like any other exception.
Gary
Martin Packman (gz) wrote : | # |
Thanks for continuing to follow up on this Alexander.
> I'm not sure what do you mean by "all attributes on the class". At least
> for OSError/ValueError we may want to filter out message or maybe args
> attribute, I don't know. What BzrError attributes is really importnat
> for us?
The Python 2.5 switch to using descriptors complicates detecting what are attributes of the error, and what are methods or attributes of the class. In practice, I think we can not worry about this because of the below.
> So it seems Gary only wanted to show the display_url for
> UncommittedChanges error.
I've been working on the assumption this information would all end up somewhere visible. If that's not the case, then being much more specific about what attributes need serialising would be better.
> So, qbzr is really care only about BzrErrors (today only UncommittdeChanges
> error used), not about generic errors. So you can rework your tests for
> ValueError and OSError to reflect this. And please add test for serializing
> UncommittedChanges error, so we can see it works as expected.
Okay, so let's change the initial catch as well to only trap BzrError instances for now too, which is less risky as we know all the code involved there.
> As for generic errors, I'd better serialize them as error class name plus
> stringified exception (like str(e) or unicode(e) but I'm not sure the latter
> will work for all errors), and send that to GUI process as `(error_name,
> {'str': str(e)})`. When we will decide that we need to process something like
> OSError we will change the serializer accordingly.
Right, that would also mean I could fix OSError stringificatin in bzrlib and not have to duplicate the code in qbzr.
Martin Packman (gz) wrote : | # |
> I think that it is best to catch all exceptions here. Essentially this
> becomes the top level exception handler, and KeyboardInterrupts should
> be reported to the main process like any other exception.
Thanks for the extra information Gary. I'm still confused about what you actually want to printed with SUB_ERROR though, as the exceptions are reraised anyway, do you want the attributes of everything or just a few specific ones?
Alexander Belchenko (bialix) wrote : | # |
C:\Temp>bzr init-repo 1
Shared repository with trees (format: 1.14)
Location:
shared repository: 1
C:\Temp>cd 1
C:\Temp\1>
C:\Temp\1>bzr init a
Created a repository tree (format: 1.14)
Using shared repository: C:/Temp/1/
C:\Temp\1>cd a
C:\Temp\1\a>
C:\Temp\1\a>bzr mkdir foo
added foo
C:\Temp\1\a>bzr ci -m1
Committing to: C:/Temp/1/a/
added foo
Committed revision 1.
C:\Temp\1\a>cd ..
C:\Temp\1>
C:\Temp\1>bzr branch a b
Branched 1 revision(s).
C:\Temp\1>cd b
C:\Temp\1\b>
C:\Temp\1\b>bzr mkdir spam
added spam
C:\Temp\1\b>bzr ci -m 2
Committing to: C:/Temp/1/b/
added spam
Committed revision 2.
C:\Temp\1\b>cd ../a
C:\Temp\1\a>
C:\Temp\1\a>bzr mkdir bar
added bar
C:\Temp\1\a>bzr qmerge ../b
If you press OK button then `bzr merge ../b` in the subprocess will fail with UncommittedChanges error and the merge dialog will show you additional frame with Commit and Revert buttons to proceed.
Pressing Commit or Revert window qbzr tries to open the tree pointed by `display_url` attribute of the error instance but in fact it will work only for ascii-only paths due to bug 388437, won't it?
So, Gary, Martin, it seems we need real unicode path to the tree/branch to be able then use it to open the tree/branch. But the question is: why do we need to extract this path from exception at all? We're already operating on the specific tree/branch, and we know it's location. The logic of using display_url is not clear for me.
Alexander Belchenko (bialix) wrote : | # |
> Pressing Commit or Revert window qbzr tries to open the tree pointed by
> `display_url` attribute of the error instance but in fact it will work only
> for ascii-only paths due to bug 388437, won't it?
OK, I was wrong. I've just tested it with russian branch name тест and it works as expected, I suppose because actual path is URL file://
- 1271. By Martin Packman
-
As qbzr uses splitlines and bencode doesn't escape newlines do need unicode-escape
- 1272. By Martin Packman
-
Change of approach, just serialise needed attributes rather than introspecting exception instances
Martin Packman (gz) wrote : | # |
Can you test this branch again Alexander? I've junked all the code trying to examine exception instances and gone with a class/attribute whitelist (currently with only one thing) instead.
I agree the use of urlutils.
Alexander Belchenko (bialix) wrote : | # |
Martin [gz] пишет:
> Can you test this branch again Alexander? I've junked all the code trying to examine exception instances and gone with a class/attribute whitelist (currently with only one thing) instead.
He-he, you're very conservative now :-)
> I agree the use of urlutils.
OK, thanks. Testing now.
--
All the dude wanted was his rug back
Alexander Belchenko (bialix) wrote : | # |
Will merge to 0.20 and trunk. Thank you.
Preview Diff
1 | === modified file 'lib/subprocess.py' |
2 | --- lib/subprocess.py 2010-10-21 11:32:48 +0000 |
3 | +++ lib/subprocess.py 2011-04-20 22:36:34 +0000 |
4 | @@ -56,6 +56,7 @@ |
5 | from bzrlib import ( |
6 | bencode, |
7 | commands, |
8 | + errors, |
9 | osutils, |
10 | ui, |
11 | ) |
12 | @@ -673,9 +674,8 @@ |
13 | data = (button == QtGui.QMessageBox.Yes) |
14 | self.process.write(SUB_GETBOOL + bencode.bencode(data) + "\n") |
15 | elif line.startswith(SUB_ERROR): |
16 | - data = bencode.bdecode(line[len(SUB_ERROR):]) |
17 | - self.error_class = data[0] |
18 | - self.error_data = decode_unicode_escape(data[1]) |
19 | + self.error_class, self.error_data = bdecode_exception_instance( |
20 | + line[len(SUB_ERROR):]) |
21 | else: |
22 | line = line.decode(self.encoding, 'replace') |
23 | self.logMessageEx(line, 'plain', self.stdout) |
24 | @@ -886,16 +886,10 @@ |
25 | argv = [unicode(p, 'utf-8') for p in bencode.bdecode(cmd_utf8)] |
26 | try: |
27 | return commands.run_bzr(argv) |
28 | + except (KeyboardInterrupt, SystemExit): |
29 | + raise |
30 | except Exception, e: |
31 | - d = {} |
32 | - for key, val in e.__dict__.iteritems(): |
33 | - if not key.startswith('_'): |
34 | - if not isinstance(val, unicode): |
35 | - val = unicode(val) |
36 | - d[key] = val |
37 | - print "%s%s" % (SUB_ERROR, |
38 | - bencode.bencode((e.__class__.__name__, |
39 | - encode_unicode_escape(d)))) |
40 | + print "%s%s" % (SUB_ERROR, bencode_exception_instance(e)) |
41 | raise |
42 | |
43 | |
44 | @@ -962,6 +956,46 @@ |
45 | def bdecode_prompt(s): |
46 | return bencode.bdecode(s).decode('unicode-escape') |
47 | |
48 | + |
49 | +def bencode_exception_instance(e): |
50 | + """Serialise the main information about an exception instance with bencode |
51 | + |
52 | + For now, nearly all exceptions just give the exception name as a string, |
53 | + but a dictionary is also given that may contain unicode-escaped attributes. |
54 | + """ |
55 | + # GZ 2011-04-15: Could use bzrlib.trace._qualified_exception_name in 2.4 |
56 | + ename = e.__class__.__name__ |
57 | + d = {} |
58 | + # For now be conservative and only serialise attributes that will get used |
59 | + keys = [] |
60 | + if isinstance(e, errors.UncommittedChanges): |
61 | + keys.append("display_url") |
62 | + for key in keys: |
63 | + # getattr and __repr__ can break in lots of ways, so catch everything |
64 | + # but exceptions that occur as interrupts, allowing for Python 2.4 |
65 | + try: |
66 | + val = getattr(e, key) |
67 | + if not isinstance(val, unicode): |
68 | + if not isinstance(val, str): |
69 | + val = repr(val) |
70 | + val = val.decode("ascii", "replace") |
71 | + except (KeyboardInterrupt, SystemExit): |
72 | + raise |
73 | + except: |
74 | + val = "[QBzr could not serialize this attribute]" |
75 | + d[key] = val.encode("unicode-escape") |
76 | + return bencode.bencode((ename, d)) |
77 | + |
78 | + |
79 | +def bdecode_exception_instance(s): |
80 | + """Deserialise information about an exception instance with bdecode""" |
81 | + ename, d = bencode.bdecode(s) |
82 | + for k in d: |
83 | + d[k] = d[k].decode("unicode-escape") |
84 | + return ename, d |
85 | + |
86 | + |
87 | +# GZ 2011-04-15: Remove or deprecate these functions if they remain unused? |
88 | def encode_unicode_escape(obj): |
89 | if isinstance(obj, dict): |
90 | result = {} |
91 | |
92 | === modified file 'lib/tests/test_subprocess.py' |
93 | --- lib/tests/test_subprocess.py 2010-05-24 06:40:05 +0000 |
94 | +++ lib/tests/test_subprocess.py 2011-04-20 22:36:34 +0000 |
95 | @@ -19,11 +19,17 @@ |
96 | # along with this program; if not, write to the Free Software |
97 | # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. |
98 | |
99 | +from bzrlib import ( |
100 | + errors, |
101 | + urlutils, |
102 | + ) |
103 | from bzrlib.tests import TestCase |
104 | from bzrlib.plugins.qbzr.lib.subprocess import ( |
105 | bdecode_prompt, |
106 | bencode_prompt, |
107 | bencode_unicode, |
108 | + bencode_exception_instance, |
109 | + bdecode_exception_instance, |
110 | encode_unicode_escape, |
111 | decode_unicode_escape, |
112 | ) |
113 | @@ -55,3 +61,42 @@ |
114 | def test_decode_unicode_escape_dict(self): |
115 | self.assertEqual({'key': 'foo\nbar', 'ukey': u'\u1234'}, |
116 | decode_unicode_escape({'key': 'foo\\nbar', 'ukey': u'\\u1234'})) |
117 | + |
118 | + |
119 | +class TestExceptionInstanceSerialisation(TestCase): |
120 | + """Check exceptions can serialised safely with needed details preserved""" |
121 | + |
122 | + def check_exception_instance(self, e): |
123 | + encoded = bencode_exception_instance(e) |
124 | + name, attr_dict = bdecode_exception_instance(encoded) |
125 | + self.assertEqual(name, e.__class__.__name__) |
126 | + return attr_dict |
127 | + |
128 | + def test_simple_error(self): |
129 | + """A common error with just an args attribute""" |
130 | + self.check_exception_instance(ValueError("Simple")) |
131 | + # TODO: if transmitted assert args/message in return dict |
132 | + |
133 | + def test_non_ascii_bytes(self): |
134 | + """An error with a non-ascii bytestring attribute""" |
135 | + self.check_exception_instance(OSError(13, "Lupa ev\xc3\xa4tty")) |
136 | + # TODO: if transmitted assert errno/strerror/etc in return dict |
137 | + |
138 | + def test_unreprable_obj(self): |
139 | + """Ensure an object with a broken repr doesn't break the exception""" |
140 | + class Bad(object): |
141 | + def __repr__(self): |
142 | + return self.attribute_that_does_not_exist |
143 | + self.check_exception_instance(ValueError(Bad())) |
144 | + # TODO: if transmitted assert message equals the placeholder string |
145 | + |
146 | + def test_uncommittedchanges_display_url(self): |
147 | + """The display_url of UncommittedChanges errors should be serialised""" |
148 | + path = u"\u1234" |
149 | + class FakeTree(object): |
150 | + def __init__(self, url): |
151 | + self.user_url = url |
152 | + attrs = self.check_exception_instance(errors.UncommittedChanges( |
153 | + FakeTree(urlutils.local_path_to_url(path)))) |
154 | + self.assertIsSameRealPath(path, |
155 | + urlutils.local_path_from_url(attrs["display_url"])) |
I've used unicode_escape to be 200% sure nothing will broke qbzr subprocess interactions. At the time I've switched to unicode_escape I've already twice tried to fix related errors with non-ascii characters, so I've decided that better be safe than sorry. Maybe it was overkill. We can try to fly with your variant now, if new non-ascii bugs wil arise then I'll be forced to switch back to unicode_escape again.
I'm going to review your code tonight. It seems we need new bugfix release right after the fix land.