Merge lp:~jameinel/bzr/2.0.4-pyrex-propagation into lp:bzr/2.0
- 2.0.4-pyrex-propagation
- Merge into 2.0
Status: | Merged |
---|---|
Approved by: | Andrew Bennetts |
Approved revision: | not available |
Merged at revision: | not available |
Proposed branch: | lp:~jameinel/bzr/2.0.4-pyrex-propagation |
Merge into: | lp:bzr/2.0 |
Diff against target: |
322 lines (+83/-21) 12 files modified
NEWS (+5/-0) bzrlib/_annotator_pyx.pyx (+8/-1) bzrlib/_btree_serializer_pyx.pyx (+2/-2) bzrlib/_chk_map_pyx.pyx (+1/-1) bzrlib/_dirstate_helpers_pyx.pyx (+11/-9) bzrlib/_groupcompress_pyx.pyx (+1/-1) bzrlib/_knit_load_data_pyx.pyx (+2/-1) bzrlib/_known_graph_pyx.pyx (+1/-1) bzrlib/_rio_pyx.pyx (+1/-1) bzrlib/_walkdirs_win32.pyx (+4/-4) bzrlib/tests/test_selftest.py (+2/-0) bzrlib/tests/test_source.py (+45/-0) |
To merge this branch: | bzr merge lp:~jameinel/bzr/2.0.4-pyrex-propagation |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Andrew Bennetts | Approve | ||
Review via email: mp+16832@code.launchpad.net |
Commit message
Description of the change
John A Meinel (jameinel) wrote : | # |
Andrew Bennetts (spiv) wrote : | # |
This looks good to me. I worry a little bit that the # cannot_raise annotations might become stale over time, but this seems to be as good as we can do without improving Pyrex itself.
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Review: Approve
> This looks good to me. I worry a little bit that the # cannot_raise annotations might become stale over time, but this seems to be as good as we can do without improving Pyrex itself.
So I have a block that checks if you have both a "cannot_raise" and a
"except :" clause.
But certainly it is possible that *today* I add a 'cannot_raise' and
then in the future someone adds a line that calls into python code, and
silently traps the exception. I think, as you say, the only way forward
is to improve Pyrex/Cython. I should try asking Cython what they think
it would take to implement this.
cannot_raise such that it can only call other cannot_raise functions,
and no pure-python functions. They already have 'nogil' which does some
of that. I wonder if we could cheat and use 'nogil' to get that
functionality.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkt
K/gAoNXM/
=+VOe
-----END PGP SIGNATURE-----
Preview Diff
1 | === modified file 'NEWS' |
2 | --- NEWS 2010-01-05 00:01:54 +0000 |
3 | +++ NEWS 2010-01-06 17:31:11 +0000 |
4 | @@ -61,6 +61,11 @@ |
5 | Testing |
6 | ******* |
7 | |
8 | +* We have a new ``test_source`` that ensures all pyrex ``cdef`` functions |
9 | + handle exceptions somehow. (Possibly by setting ``# cannot_raise`` |
10 | + rather than an ``except ?:`` clause.) This should help prevent bugs like |
11 | + bug #495023. (John Arbash Meinel) |
12 | + |
13 | |
14 | bzr 2.0.3 |
15 | ######### |
16 | |
17 | === modified file 'bzrlib/_annotator_pyx.pyx' |
18 | --- bzrlib/_annotator_pyx.pyx 2009-07-08 17:09:03 +0000 |
19 | +++ bzrlib/_annotator_pyx.pyx 2010-01-06 17:31:11 +0000 |
20 | @@ -83,7 +83,14 @@ |
21 | return 0 |
22 | |
23 | |
24 | -cdef PyObject *_next_tuple_entry(object tpl, Py_ssize_t *pos): |
25 | +cdef PyObject *_next_tuple_entry(object tpl, Py_ssize_t *pos): # cannot_raise |
26 | + """Return the next entry from this tuple. |
27 | + |
28 | + :param tpl: The tuple we are investigating, *must* be a PyTuple |
29 | + :param pos: The last item we found. Will be updated to the new position. |
30 | + |
31 | + This cannot raise an exception, as it does no error checking. |
32 | + """ |
33 | pos[0] = pos[0] + 1 |
34 | if pos[0] >= PyTuple_GET_SIZE(tpl): |
35 | return NULL |
36 | |
37 | === modified file 'bzrlib/_btree_serializer_pyx.pyx' |
38 | --- bzrlib/_btree_serializer_pyx.pyx 2009-10-01 20:50:36 +0000 |
39 | +++ bzrlib/_btree_serializer_pyx.pyx 2010-01-06 17:31:11 +0000 |
40 | @@ -1,4 +1,4 @@ |
41 | -# Copyright (C) 2008 Canonical Ltd |
42 | +# Copyright (C) 2008, 2009 Canonical Ltd |
43 | # |
44 | # This program is free software; you can redistribute it and/or modify |
45 | # it under the terms of the GNU General Public License as published by |
46 | @@ -54,7 +54,7 @@ |
47 | |
48 | |
49 | # TODO: Find some way to import this from _dirstate_helpers |
50 | -cdef void* _my_memrchr(void *s, int c, size_t n): |
51 | +cdef void* _my_memrchr(void *s, int c, size_t n): # cannot_raise |
52 | # memrchr seems to be a GNU extension, so we have to implement it ourselves |
53 | # It is not present in any win32 standard library |
54 | cdef char *pos |
55 | |
56 | === modified file 'bzrlib/_chk_map_pyx.pyx' |
57 | --- bzrlib/_chk_map_pyx.pyx 2009-06-22 12:52:39 +0000 |
58 | +++ bzrlib/_chk_map_pyx.pyx 2010-01-06 17:31:11 +0000 |
59 | @@ -65,7 +65,7 @@ |
60 | _unknown = None |
61 | |
62 | # We shouldn't just copy this from _dirstate_helpers_pyx |
63 | -cdef void* _my_memrchr(void *s, int c, size_t n): |
64 | +cdef void* _my_memrchr(void *s, int c, size_t n): # cannot_raise |
65 | # memrchr seems to be a GNU extension, so we have to implement it ourselves |
66 | cdef char *pos |
67 | cdef char *start |
68 | |
69 | === modified file 'bzrlib/_dirstate_helpers_pyx.pyx' |
70 | --- bzrlib/_dirstate_helpers_pyx.pyx 2010-01-04 20:06:30 +0000 |
71 | +++ bzrlib/_dirstate_helpers_pyx.pyx 2010-01-06 17:31:11 +0000 |
72 | @@ -119,7 +119,7 @@ |
73 | # void *memrchr(void *s, int c, size_t len) |
74 | |
75 | |
76 | -cdef void* _my_memrchr(void *s, int c, size_t n): |
77 | +cdef void* _my_memrchr(void *s, int c, size_t n): # cannot_raise |
78 | # memrchr seems to be a GNU extension, so we have to implement it ourselves |
79 | cdef char *pos |
80 | cdef char *start |
81 | @@ -165,7 +165,7 @@ |
82 | return PyString_FromStringAndSize(s, size) |
83 | |
84 | |
85 | -cdef int _is_aligned(void *ptr): |
86 | +cdef int _is_aligned(void *ptr): # cannot_raise |
87 | """Is this pointer aligned to an integer size offset? |
88 | |
89 | :return: 1 if this pointer is aligned, 0 otherwise. |
90 | @@ -173,7 +173,7 @@ |
91 | return ((<intptr_t>ptr) & ((sizeof(int))-1)) == 0 |
92 | |
93 | |
94 | -cdef int _cmp_by_dirs(char *path1, int size1, char *path2, int size2): |
95 | +cdef int _cmp_by_dirs(char *path1, int size1, char *path2, int size2): # cannot_raise |
96 | cdef unsigned char *cur1 |
97 | cdef unsigned char *cur2 |
98 | cdef unsigned char *end1 |
99 | @@ -295,7 +295,7 @@ |
100 | |
101 | |
102 | cdef int _cmp_path_by_dirblock_intern(char *path1, int path1_len, |
103 | - char *path2, int path2_len): |
104 | + char *path2, int path2_len): # cannot_raise |
105 | """Compare two paths by what directory they are in. |
106 | |
107 | see ``_cmp_path_by_dirblock`` for details. |
108 | @@ -768,7 +768,7 @@ |
109 | state._dirblock_state = DirState.IN_MEMORY_UNMODIFIED |
110 | |
111 | |
112 | -cdef int minikind_from_mode(int mode): |
113 | +cdef int minikind_from_mode(int mode): # cannot_raise |
114 | # in order of frequency: |
115 | if S_ISREG(mode): |
116 | return c"f" |
117 | @@ -915,7 +915,8 @@ |
118 | return link_or_sha1 |
119 | |
120 | |
121 | -cdef char _minikind_from_string(object string): |
122 | +# TODO: Do we want to worry about exceptions here? |
123 | +cdef char _minikind_from_string(object string) except? -1: |
124 | """Convert a python string to a char.""" |
125 | return PyString_AsString(string)[0] |
126 | |
127 | @@ -953,7 +954,7 @@ |
128 | raise KeyError(PyString_FromStringAndSize(_minikind, 1)) |
129 | |
130 | |
131 | -cdef int _versioned_minikind(char minikind): |
132 | +cdef int _versioned_minikind(char minikind): # cannot_raise |
133 | """Return non-zero if minikind is in fltd""" |
134 | return (minikind == c'f' or |
135 | minikind == c'd' or |
136 | @@ -1373,7 +1374,7 @@ |
137 | def iter_changes(self): |
138 | return self |
139 | |
140 | - cdef void _gather_result_for_consistency(self, result): |
141 | + cdef int _gather_result_for_consistency(self, result) except -1: |
142 | """Check a result we will yield to make sure we are consistent later. |
143 | |
144 | This gathers result's parents into a set to output later. |
145 | @@ -1381,7 +1382,7 @@ |
146 | :param result: A result tuple. |
147 | """ |
148 | if not self.partial or not result[0]: |
149 | - return |
150 | + return 0 |
151 | self.seen_ids.add(result[0]) |
152 | new_path = result[1][1] |
153 | if new_path: |
154 | @@ -1391,6 +1392,7 @@ |
155 | # Add the root directory which parent_directories does not |
156 | # provide. |
157 | self.search_specific_file_parents.add('') |
158 | + return 0 |
159 | |
160 | cdef int _update_current_block(self) except -1: |
161 | if (self.block_index < len(self.state._dirblocks) and |
162 | |
163 | === modified file 'bzrlib/_groupcompress_pyx.pyx' |
164 | --- bzrlib/_groupcompress_pyx.pyx 2009-06-10 03:56:49 +0000 |
165 | +++ bzrlib/_groupcompress_pyx.pyx 2010-01-06 17:31:11 +0000 |
166 | @@ -276,7 +276,7 @@ |
167 | |
168 | |
169 | cdef unsigned char *_decode_copy_instruction(unsigned char *bytes, |
170 | - unsigned char cmd, unsigned int *offset, unsigned int *length): |
171 | + unsigned char cmd, unsigned int *offset, unsigned int *length): # cannot_raise |
172 | """Decode a copy instruction from the next few bytes. |
173 | |
174 | A copy instruction is a variable number of bytes, so we will parse the |
175 | |
176 | === modified file 'bzrlib/_knit_load_data_pyx.pyx' |
177 | --- bzrlib/_knit_load_data_pyx.pyx 2009-06-22 12:52:39 +0000 |
178 | +++ bzrlib/_knit_load_data_pyx.pyx 2010-01-06 17:31:11 +0000 |
179 | @@ -97,11 +97,12 @@ |
180 | self.end_str = NULL |
181 | self.history_len = 0 |
182 | |
183 | - cdef void validate(self): |
184 | + cdef int validate(self) except -1: |
185 | if not PyDict_CheckExact(self.cache): |
186 | raise TypeError('kndx._cache must be a python dict') |
187 | if not PyList_CheckExact(self.history): |
188 | raise TypeError('kndx._history must be a python list') |
189 | + return 0 |
190 | |
191 | cdef object process_options(self, char *option_str, char *end): |
192 | """Process the options string into a list.""" |
193 | |
194 | === modified file 'bzrlib/_known_graph_pyx.pyx' |
195 | --- bzrlib/_known_graph_pyx.pyx 2009-09-02 13:32:52 +0000 |
196 | +++ bzrlib/_known_graph_pyx.pyx 2010-01-06 17:31:11 +0000 |
197 | @@ -591,7 +591,7 @@ |
198 | self._revno_first, self._revno_second, self._revno_last, |
199 | self.is_first_child, self.seen_by_child) |
200 | |
201 | - cdef int has_pending_parents(self): |
202 | + cdef int has_pending_parents(self): # cannot_raise |
203 | if self.left_pending_parent is not None or self.pending_parents: |
204 | return 1 |
205 | return 0 |
206 | |
207 | === modified file 'bzrlib/_rio_pyx.pyx' |
208 | --- bzrlib/_rio_pyx.pyx 2009-05-15 02:36:03 +0000 |
209 | +++ bzrlib/_rio_pyx.pyx 2010-01-06 17:31:11 +0000 |
210 | @@ -49,7 +49,7 @@ |
211 | |
212 | from bzrlib.rio import Stanza |
213 | |
214 | -cdef int _valid_tag_char(char c): |
215 | +cdef int _valid_tag_char(char c): # cannot_raise |
216 | return (c == c'_' or c == c'-' or |
217 | (c >= c'a' and c <= c'z') or |
218 | (c >= c'A' and c <= c'Z') or |
219 | |
220 | === modified file 'bzrlib/_walkdirs_win32.pyx' |
221 | --- bzrlib/_walkdirs_win32.pyx 2009-03-23 14:59:43 +0000 |
222 | +++ bzrlib/_walkdirs_win32.pyx 2010-01-06 17:31:11 +0000 |
223 | @@ -109,7 +109,7 @@ |
224 | wcslen(data.cFileName)) |
225 | |
226 | |
227 | -cdef int _get_mode_bits(WIN32_FIND_DATAW *data): |
228 | +cdef int _get_mode_bits(WIN32_FIND_DATAW *data): # cannot_raise |
229 | cdef int mode_bits |
230 | |
231 | mode_bits = 0100666 # writeable file, the most common |
232 | @@ -121,13 +121,13 @@ |
233 | return mode_bits |
234 | |
235 | |
236 | -cdef __int64 _get_size(WIN32_FIND_DATAW *data): |
237 | +cdef __int64 _get_size(WIN32_FIND_DATAW *data): # cannot_raise |
238 | # Pyrex casts a DWORD into a PyLong anyway, so it is safe to do << 32 |
239 | # on a DWORD |
240 | return ((<__int64>data.nFileSizeHigh) << 32) + data.nFileSizeLow |
241 | |
242 | |
243 | -cdef double _ftime_to_timestamp(FILETIME *ft): |
244 | +cdef double _ftime_to_timestamp(FILETIME *ft): # cannot_raise |
245 | """Convert from a FILETIME struct into a floating point timestamp. |
246 | |
247 | The fields of a FILETIME structure are the hi and lo part |
248 | @@ -147,7 +147,7 @@ |
249 | return (val * 1.0e-7) - 11644473600.0 |
250 | |
251 | |
252 | -cdef int _should_skip(WIN32_FIND_DATAW *data): |
253 | +cdef int _should_skip(WIN32_FIND_DATAW *data): # cannot_raise |
254 | """Is this '.' or '..' so we should skip it?""" |
255 | if (data.cFileName[0] != c'.'): |
256 | return 0 |
257 | |
258 | === modified file 'bzrlib/tests/test_selftest.py' |
259 | --- bzrlib/tests/test_selftest.py 2009-10-26 22:03:28 +0000 |
260 | +++ bzrlib/tests/test_selftest.py 2010-01-06 17:31:11 +0000 |
261 | @@ -892,6 +892,8 @@ |
262 | result.report_unsupported(test, feature) |
263 | output = result_stream.getvalue()[prefix:] |
264 | lines = output.splitlines() |
265 | + # XXX: This is a timing dependent test. I've had it fail because it |
266 | + # took 6ms to evaluate... :( |
267 | self.assertEqual(lines, ['NODEP 0ms', |
268 | " The feature 'Feature' is not available."]) |
269 | |
270 | |
271 | === modified file 'bzrlib/tests/test_source.py' |
272 | --- bzrlib/tests/test_source.py 2009-09-10 03:44:53 +0000 |
273 | +++ bzrlib/tests/test_source.py 2010-01-06 17:31:11 +0000 |
274 | @@ -369,3 +369,48 @@ |
275 | self.fail( |
276 | "these files contain an assert statement and should not:\n%s" |
277 | % '\n'.join(badfiles)) |
278 | + |
279 | + def test_extension_exceptions(self): |
280 | + """Extension functions should propagate exceptions. |
281 | + |
282 | + Either they should return an object, have an 'except' clause, or have a |
283 | + "# cannot_raise" to indicate that we've audited them and defined them as not |
284 | + raising exceptions. |
285 | + """ |
286 | + both_exc_and_no_exc = [] |
287 | + missing_except = [] |
288 | + class_re = re.compile(r'^(cdef\s+)?class (\w+).*:', re.MULTILINE) |
289 | + except_re = re.compile(r'cdef\s*' # start with cdef |
290 | + r'([\w *]*?)\s*' # this is the return signature |
291 | + r'(\w+)\s*\(' # the function name |
292 | + r'[^)]*\)\s*' # parameters |
293 | + r'(.*)\s*:' # the except clause |
294 | + r'\s*(#\s*cannot[- _]raise)?' # cannot raise comment |
295 | + ) |
296 | + for fname, text in self.get_source_file_contents( |
297 | + extensions=('.pyx',)): |
298 | + known_classes = set([m[1] for m in class_re.findall(text)]) |
299 | + cdefs = except_re.findall(text) |
300 | + for sig, func, exc_clause, no_exc_comment in cdefs: |
301 | + if not sig or sig in known_classes: |
302 | + sig = 'object' |
303 | + if exc_clause and no_exc_comment: |
304 | + both_exc_and_no_exc.append((fname, func)) |
305 | + if sig != 'object' and not (exc_clause or no_exc_comment): |
306 | + missing_except.append((fname, func)) |
307 | + error_msg = [] |
308 | + if both_exc_and_no_exc: |
309 | + error_msg.append('The following functions had "cannot raise" comments' |
310 | + ' but did have an except clause set:') |
311 | + for fname, func in both_exc_and_no_exc: |
312 | + error_msg.append('%s:%s' % (fname, func)) |
313 | + error_msg.extend(('', '')) |
314 | + if missing_except: |
315 | + error_msg.append('The following functions have fixed return types,' |
316 | + ' but no except clause.') |
317 | + error_msg.append('Either add an except or append "# cannot_raise".') |
318 | + for fname, func in missing_except: |
319 | + error_msg.append('%s:%s' % (fname, func)) |
320 | + error_msg.extend(('', '')) |
321 | + if error_msg: |
322 | + self.fail('\n'.join(error_msg)) |
This addresses Andrew's follow up about my earlier fix.
It adds a 'test_source' test that all 'cdef' functions have thought about how they want to handle exceptions. Either they
1) Return an object
2) Return a class defined in local scope (which is an object)
3) Have an 'except ?' clause
4) Add "# cannot_raise" (useful for pure C functions that know they cannot raise)
I ran this over the .pyx files and fixed them all to conform. It does catch result_ for_consistency (self, result) except -1:
+ cdef int _gather_
and _knit_load_data's
- cdef void validate(self):
+ cdef int validate(self) except -1:
Otherwise, all other cases are 'cannot_raise'. I expect to find a few more in the 2.1 branch, which will require more work than just merging 2.0 => bzr.dev once this lands. I'll create a branch for it soon.