Bazaar

Merge lp:~jameinel/bzr/2.0.4-pyrex-propagation into lp:bzr/2.0

2.0.4-pyrex-propagation
Merge into 2.0

Proposed by John A Meinel on 2010-01-05

Status:	Merged
Approved by:	Andrew Bennetts on 2010-01-06
Approved revision:	not available
Merged at revision:	not available
Proposed branch:	lp:~jameinel/bzr/2.0.4-pyrex-propagation
Merge into:	lp:bzr/2.0
Diff against target:	322 lines (+83/-21) 12 files modified NEWS (+5/-0) bzrlib/_annotator_pyx.pyx (+8/-1) bzrlib/_btree_serializer_pyx.pyx (+2/-2) bzrlib/_chk_map_pyx.pyx (+1/-1) bzrlib/_dirstate_helpers_pyx.pyx (+11/-9) bzrlib/_groupcompress_pyx.pyx (+1/-1) bzrlib/_knit_load_data_pyx.pyx (+2/-1) bzrlib/_known_graph_pyx.pyx (+1/-1) bzrlib/_rio_pyx.pyx (+1/-1) bzrlib/_walkdirs_win32.pyx (+4/-4) bzrlib/tests/test_selftest.py (+2/-0) bzrlib/tests/test_source.py (+45/-0)
To merge this branch:	bzr merge lp:~jameinel/bzr/2.0.4-pyrex-propagation
Related bugs:	Link a bug report

Reviewer	Review Type	Date Requested	Status
Andrew Bennetts		2010-01-05	Approve on 2010-01-06
Review via email: mp+16832@code.launchpad.net

Revision history for this message

John A Meinel (jameinel) wrote on 2010-01-05:

This addresses Andrew's follow up about my earlier fix.

It adds a 'test_source' test that all 'cdef' functions have thought about how they want to handle exceptions. Either they

1) Return an object
2) Return a class defined in local scope (which is an object)
3) Have an 'except ?' clause
4) Add "# cannot_raise" (useful for pure C functions that know they cannot raise)

I ran this over the .pyx files and fixed them all to conform. It does catch
+ cdef int _gather_result_for_consistency(self, result) except -1:

and _knit_load_data's
- cdef void validate(self):
+ cdef int validate(self) except -1:

Otherwise, all other cases are 'cannot_raise'. I expect to find a few more in the 2.1 branch, which will require more work than just merging 2.0 => bzr.dev once this lands. I'll create a branch for it soon.

Revision history for this message

Andrew Bennetts (spiv) wrote on 2010-01-06:

This looks good to me. I worry a little bit that the # cannot_raise annotations might become stale over time, but this seems to be as good as we can do without improving Pyrex itself.

review: Approve

Revision history for this message

John A Meinel (jameinel) wrote on 2010-01-06:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Bennetts wrote:
> Review: Approve
> This looks good to me. I worry a little bit that the # cannot_raise annotations might become stale over time, but this seems to be as good as we can do without improving Pyrex itself.

So I have a block that checks if you have both a "cannot_raise" and a
"except :" clause.

But certainly it is possible that *today* I add a 'cannot_raise' and
then in the future someone adds a line that calls into python code, and
silently traps the exception. I think, as you say, the only way forward
is to improve Pyrex/Cython. I should try asking Cython what they think
it would take to implement this.

cannot_raise such that it can only call other cannot_raise functions,
and no pure-python functions. They already have 'nogil' which does some
of that. I wonder if we could cheat and use 'nogil' to get that
functionality.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktElrAACgkQJdeBCYSNAAOW5QCffahehGhbJeqh8Vz5U9nNBpME
K/gAoNXM/lGOhy+aBj86tRFscLZeUn1S
=+VOe
-----END PGP SIGNATURE-----

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Alexander Belchenko

Bazaar Codereview Subscribers

Benoit Pierre

John A Meinel

Martin Pool

Matt Nordhoff

bzr PQM

pascalprost

Bazaar

Merge lp:~jameinel/bzr/2.0.4-pyrex-propagation into lp:bzr/2.0

Commit message

Description of the change

Preview Diff

Subscribers

1	=== modified file 'NEWS'
2	--- NEWS 2010-01-05 00:01:54 +0000
3	+++ NEWS 2010-01-06 17:31:11 +0000
4	@@ -61,6 +61,11 @@
5	Testing
6	*******
7
8	+* We have a new ``test_source`` that ensures all pyrex ``cdef`` functions
9	+ handle exceptions somehow. (Possibly by setting ``# cannot_raise``
10	+ rather than an ``except ?:`` clause.) This should help prevent bugs like
11	+ bug #495023. (John Arbash Meinel)
12	+
13
14	bzr 2.0.3
15	#########
16
17	=== modified file 'bzrlib/_annotator_pyx.pyx'
18	--- bzrlib/_annotator_pyx.pyx 2009-07-08 17:09:03 +0000
19	+++ bzrlib/_annotator_pyx.pyx 2010-01-06 17:31:11 +0000
20	@@ -83,7 +83,14 @@
21	return 0
22
23
24	-cdef PyObject _next_tuple_entry(object tpl, Py_ssize_t pos):
25	+cdef PyObject _next_tuple_entry(object tpl, Py_ssize_t pos): # cannot_raise
26	+ """Return the next entry from this tuple.
27	+
28	+ :param tpl: The tuple we are investigating, must be a PyTuple
29	+ :param pos: The last item we found. Will be updated to the new position.
30	+
31	+ This cannot raise an exception, as it does no error checking.
32	+ """
33	pos[0] = pos[0] + 1
34	if pos[0] >= PyTuple_GET_SIZE(tpl):
35	return NULL
36
37	=== modified file 'bzrlib/_btree_serializer_pyx.pyx'
38	--- bzrlib/_btree_serializer_pyx.pyx 2009-10-01 20:50:36 +0000
39	+++ bzrlib/_btree_serializer_pyx.pyx 2010-01-06 17:31:11 +0000
40	@@ -1,4 +1,4 @@
41	-# Copyright (C) 2008 Canonical Ltd
42	+# Copyright (C) 2008, 2009 Canonical Ltd
43	#
44	# This program is free software; you can redistribute it and/or modify
45	# it under the terms of the GNU General Public License as published by
46	@@ -54,7 +54,7 @@
47
48
49	# TODO: Find some way to import this from _dirstate_helpers
50	-cdef void* _my_memrchr(void *s, int c, size_t n):
51	+cdef void* _my_memrchr(void *s, int c, size_t n): # cannot_raise
52	# memrchr seems to be a GNU extension, so we have to implement it ourselves
53	# It is not present in any win32 standard library
54	cdef char *pos
55
56	=== modified file 'bzrlib/_chk_map_pyx.pyx'
57	--- bzrlib/_chk_map_pyx.pyx 2009-06-22 12:52:39 +0000
58	+++ bzrlib/_chk_map_pyx.pyx 2010-01-06 17:31:11 +0000
59	@@ -65,7 +65,7 @@
60	_unknown = None
61
62	# We shouldn't just copy this from _dirstate_helpers_pyx
63	-cdef void* _my_memrchr(void *s, int c, size_t n):
64	+cdef void* _my_memrchr(void *s, int c, size_t n): # cannot_raise
65	# memrchr seems to be a GNU extension, so we have to implement it ourselves
66	cdef char *pos
67	cdef char *start
68
69	=== modified file 'bzrlib/_dirstate_helpers_pyx.pyx'
70	--- bzrlib/_dirstate_helpers_pyx.pyx 2010-01-04 20:06:30 +0000
71	+++ bzrlib/_dirstate_helpers_pyx.pyx 2010-01-06 17:31:11 +0000
72	@@ -119,7 +119,7 @@
73	# void memrchr(void s, int c, size_t len)
74
75
76	-cdef void* _my_memrchr(void *s, int c, size_t n):
77	+cdef void* _my_memrchr(void *s, int c, size_t n): # cannot_raise
78	# memrchr seems to be a GNU extension, so we have to implement it ourselves
79	cdef char *pos
80	cdef char *start
81	@@ -165,7 +165,7 @@
82	return PyString_FromStringAndSize(s, size)
83
84
85	-cdef int _is_aligned(void *ptr):
86	+cdef int _is_aligned(void *ptr): # cannot_raise
87	"""Is this pointer aligned to an integer size offset?
88
89	:return: 1 if this pointer is aligned, 0 otherwise.
90	@@ -173,7 +173,7 @@
91	return ((<intptr_t>ptr) & ((sizeof(int))-1)) == 0
92
93
94	-cdef int _cmp_by_dirs(char path1, int size1, char path2, int size2):
95	+cdef int _cmp_by_dirs(char path1, int size1, char path2, int size2): # cannot_raise
96	cdef unsigned char *cur1
97	cdef unsigned char *cur2
98	cdef unsigned char *end1
99	@@ -295,7 +295,7 @@
100
101
102	cdef int _cmp_path_by_dirblock_intern(char *path1, int path1_len,
103	- char *path2, int path2_len):
104	+ char *path2, int path2_len): # cannot_raise
105	"""Compare two paths by what directory they are in.
106
107	see ``_cmp_path_by_dirblock`` for details.
108	@@ -768,7 +768,7 @@
109	state._dirblock_state = DirState.IN_MEMORY_UNMODIFIED
110
111
112	-cdef int minikind_from_mode(int mode):
113	+cdef int minikind_from_mode(int mode): # cannot_raise
114	# in order of frequency:
115	if S_ISREG(mode):
116	return c"f"
117	@@ -915,7 +915,8 @@
118	return link_or_sha1
119
120
121	-cdef char _minikind_from_string(object string):
122	+# TODO: Do we want to worry about exceptions here?
123	+cdef char _minikind_from_string(object string) except? -1:
124	"""Convert a python string to a char."""
125	return PyString_AsString(string)[0]
126
127	@@ -953,7 +954,7 @@
128	raise KeyError(PyString_FromStringAndSize(_minikind, 1))
129
130
131	-cdef int _versioned_minikind(char minikind):
132	+cdef int _versioned_minikind(char minikind): # cannot_raise
133	"""Return non-zero if minikind is in fltd"""
134	return (minikind == c'f' or
135	minikind == c'd' or
136	@@ -1373,7 +1374,7 @@
137	def iter_changes(self):
138	return self
139
140	- cdef void _gather_result_for_consistency(self, result):
141	+ cdef int _gather_result_for_consistency(self, result) except -1:
142	"""Check a result we will yield to make sure we are consistent later.
143
144	This gathers result's parents into a set to output later.
145	@@ -1381,7 +1382,7 @@
146	:param result: A result tuple.
147	"""
148	if not self.partial or not result[0]:
149	- return
150	+ return 0
151	self.seen_ids.add(result[0])
152	new_path = result[1][1]
153	if new_path:
154	@@ -1391,6 +1392,7 @@
155	# Add the root directory which parent_directories does not
156	# provide.
157	self.search_specific_file_parents.add('')
158	+ return 0
159
160	cdef int _update_current_block(self) except -1:
161	if (self.block_index < len(self.state._dirblocks) and
162
163	=== modified file 'bzrlib/_groupcompress_pyx.pyx'
164	--- bzrlib/_groupcompress_pyx.pyx 2009-06-10 03:56:49 +0000
165	+++ bzrlib/_groupcompress_pyx.pyx 2010-01-06 17:31:11 +0000
166	@@ -276,7 +276,7 @@
167
168
169	cdef unsigned char _decode_copy_instruction(unsigned char bytes,
170	- unsigned char cmd, unsigned int offset, unsigned int length):
171	+ unsigned char cmd, unsigned int offset, unsigned int length): # cannot_raise
172	"""Decode a copy instruction from the next few bytes.
173
174	A copy instruction is a variable number of bytes, so we will parse the
175
176	=== modified file 'bzrlib/_knit_load_data_pyx.pyx'
177	--- bzrlib/_knit_load_data_pyx.pyx 2009-06-22 12:52:39 +0000
178	+++ bzrlib/_knit_load_data_pyx.pyx 2010-01-06 17:31:11 +0000
179	@@ -97,11 +97,12 @@
180	self.end_str = NULL
181	self.history_len = 0
182
183	- cdef void validate(self):
184	+ cdef int validate(self) except -1:
185	if not PyDict_CheckExact(self.cache):
186	raise TypeError('kndx._cache must be a python dict')
187	if not PyList_CheckExact(self.history):
188	raise TypeError('kndx._history must be a python list')
189	+ return 0
190
191	cdef object process_options(self, char option_str, char end):
192	"""Process the options string into a list."""
193
194	=== modified file 'bzrlib/_known_graph_pyx.pyx'
195	--- bzrlib/_known_graph_pyx.pyx 2009-09-02 13:32:52 +0000
196	+++ bzrlib/_known_graph_pyx.pyx 2010-01-06 17:31:11 +0000
197	@@ -591,7 +591,7 @@
198	self._revno_first, self._revno_second, self._revno_last,
199	self.is_first_child, self.seen_by_child)
200
201	- cdef int has_pending_parents(self):
202	+ cdef int has_pending_parents(self): # cannot_raise
203	if self.left_pending_parent is not None or self.pending_parents:
204	return 1
205	return 0
206
207	=== modified file 'bzrlib/_rio_pyx.pyx'
208	--- bzrlib/_rio_pyx.pyx 2009-05-15 02:36:03 +0000
209	+++ bzrlib/_rio_pyx.pyx 2010-01-06 17:31:11 +0000
210	@@ -49,7 +49,7 @@
211
212	from bzrlib.rio import Stanza
213
214	-cdef int _valid_tag_char(char c):
215	+cdef int _valid_tag_char(char c): # cannot_raise
216	return (c == c'_' or c == c'-' or
217	(c >= c'a' and c <= c'z') or
218	(c >= c'A' and c <= c'Z') or
219
220	=== modified file 'bzrlib/_walkdirs_win32.pyx'
221	--- bzrlib/_walkdirs_win32.pyx 2009-03-23 14:59:43 +0000
222	+++ bzrlib/_walkdirs_win32.pyx 2010-01-06 17:31:11 +0000
223	@@ -109,7 +109,7 @@
224	wcslen(data.cFileName))
225
226
227	-cdef int _get_mode_bits(WIN32_FIND_DATAW *data):
228	+cdef int _get_mode_bits(WIN32_FIND_DATAW *data): # cannot_raise
229	cdef int mode_bits
230
231	mode_bits = 0100666 # writeable file, the most common
232	@@ -121,13 +121,13 @@
233	return mode_bits
234
235
236	-cdef __int64 _get_size(WIN32_FIND_DATAW *data):
237	+cdef __int64 _get_size(WIN32_FIND_DATAW *data): # cannot_raise
238	# Pyrex casts a DWORD into a PyLong anyway, so it is safe to do << 32
239	# on a DWORD
240	return ((<__int64>data.nFileSizeHigh) << 32) + data.nFileSizeLow
241
242
243	-cdef double _ftime_to_timestamp(FILETIME *ft):
244	+cdef double _ftime_to_timestamp(FILETIME *ft): # cannot_raise
245	"""Convert from a FILETIME struct into a floating point timestamp.
246
247	The fields of a FILETIME structure are the hi and lo part
248	@@ -147,7 +147,7 @@
249	return (val * 1.0e-7) - 11644473600.0
250
251
252	-cdef int _should_skip(WIN32_FIND_DATAW *data):
253	+cdef int _should_skip(WIN32_FIND_DATAW *data): # cannot_raise
254	"""Is this '.' or '..' so we should skip it?"""
255	if (data.cFileName[0] != c'.'):
256	return 0
257
258	=== modified file 'bzrlib/tests/test_selftest.py'
259	--- bzrlib/tests/test_selftest.py 2009-10-26 22:03:28 +0000
260	+++ bzrlib/tests/test_selftest.py 2010-01-06 17:31:11 +0000
261	@@ -892,6 +892,8 @@
262	result.report_unsupported(test, feature)
263	output = result_stream.getvalue()[prefix:]
264	lines = output.splitlines()
265	+ # XXX: This is a timing dependent test. I've had it fail because it
266	+ # took 6ms to evaluate... :(
267	self.assertEqual(lines, ['NODEP 0ms',
268	" The feature 'Feature' is not available."])
269
270
271	=== modified file 'bzrlib/tests/test_source.py'
272	--- bzrlib/tests/test_source.py 2009-09-10 03:44:53 +0000
273	+++ bzrlib/tests/test_source.py 2010-01-06 17:31:11 +0000
274	@@ -369,3 +369,48 @@
275	self.fail(
276	"these files contain an assert statement and should not:\n%s"
277	% '\n'.join(badfiles))
278	+
279	+ def test_extension_exceptions(self):
280	+ """Extension functions should propagate exceptions.
281	+
282	+ Either they should return an object, have an 'except' clause, or have a
283	+ "# cannot_raise" to indicate that we've audited them and defined them as not
284	+ raising exceptions.
285	+ """
286	+ both_exc_and_no_exc = []
287	+ missing_except = []
288	+ class_re = re.compile(r'^(cdef\s+)?class (\w+).*:', re.MULTILINE)
289	+ except_re = re.compile(r'cdef\s*' # start with cdef
290	+ r'([\w ]?)\s*' # this is the return signature
291	+ r'(\w+)\s*\(' # the function name
292	+ r'[^)]\)\s' # parameters
293	+ r'(.)\s:' # the except clause
294	+ r'\s(#\scannot[- _]raise)?' # cannot raise comment
295	+ )
296	+ for fname, text in self.get_source_file_contents(
297	+ extensions=('.pyx',)):
298	+ known_classes = set([m[1] for m in class_re.findall(text)])
299	+ cdefs = except_re.findall(text)
300	+ for sig, func, exc_clause, no_exc_comment in cdefs:
301	+ if not sig or sig in known_classes:
302	+ sig = 'object'
303	+ if exc_clause and no_exc_comment:
304	+ both_exc_and_no_exc.append((fname, func))
305	+ if sig != 'object' and not (exc_clause or no_exc_comment):
306	+ missing_except.append((fname, func))
307	+ error_msg = []
308	+ if both_exc_and_no_exc:
309	+ error_msg.append('The following functions had "cannot raise" comments'
310	+ ' but did have an except clause set:')
311	+ for fname, func in both_exc_and_no_exc:
312	+ error_msg.append('%s:%s' % (fname, func))
313	+ error_msg.extend(('', ''))
314	+ if missing_except:
315	+ error_msg.append('The following functions have fixed return types,'
316	+ ' but no except clause.')
317	+ error_msg.append('Either add an except or append "# cannot_raise".')
318	+ for fname, func in missing_except:
319	+ error_msg.append('%s:%s' % (fname, func))
320	+ error_msg.extend(('', ''))
321	+ if error_msg:
322	+ self.fail('\n'.join(error_msg))