Merge lp:~whitley/bzr/ignore-exception into lp:bzr
- ignore-exception
- Merge into bzr.dev
Status: | Merged | ||||
---|---|---|---|---|---|
Approved by: | Martin Pool | ||||
Approved revision: | not available | ||||
Merged at revision: | not available | ||||
Proposed branch: | lp:~whitley/bzr/ignore-exception | ||||
Merge into: | lp:bzr | ||||
Diff against target: |
270 lines (+112/-5) 8 files modified
NEWS (+7/-0) bzrlib/builtins.py (+19/-1) bzrlib/globbing.py (+35/-2) bzrlib/help_topics/en/patterns.txt (+3/-0) bzrlib/tests/per_workingtree/test_is_ignored.py (+15/-0) bzrlib/tests/test_globbing.py (+25/-0) bzrlib/tests/test_ignores.py (+4/-0) bzrlib/workingtree.py (+4/-2) |
||||
To merge this branch: | bzr merge lp:~whitley/bzr/ignore-exception | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Martin Pool | Approve | ||
John A Meinel | Approve | ||
Review via email: mp+17151@code.launchpad.net |
Commit message
Description of the change
John Whitley (whitley) wrote : | # |
Martin Pool (mbp) wrote : | # |
That is a nice feature and the patch looks reasonable. I saw there was a thread about benchmarking it to check there was no performance regression - is that verified now?
In the shell examples, you might need to take into account that in some shells doublequotes don't escape !. You might be safer with \!.
Can you please execute the contributor agreement <http://
John A Meinel (jameinel) wrote : | # |
The code looks good. I think the only thing we are waiting on is
1) Running some perf tests of 'bzr status' on a fairly large tree and/or initial import.
a) Create a 50k tree
run bzr init
time bzr add
b) For all files, touch f + '.o'
time bzr st
2) Contributor agreement.
...
115 -
116 - if not pattern.
117 + if not (pattern.
^- What about pattern.
...
204 + patterns = [ u'*', u'!./local', u'!./local/**/*', u'!RE:\
^- our style guide would have this written as:
204 + patterns = [u'*', u'!./local', u'!./local/**/*', u'!RE:\.z.*', u'!!./.zcompdump']
Though you probably also need to watch out for >80 chars (just wrap to the next line as needed). You may also want to add a !!RE: pattern to catch the above issue.
same here:
215 + patterns = [ u'static/
John Whitley (whitley) wrote : | # |
John A Meinel <email address hidden> wrote ..
> Review [...]
Thanks for the review and comments. I'll have the contributor agreement, perf results, and John's suggested changes (plus a few more tests related to same) in ASAP. I've been a bit time-crunched recently but should be able to button this up soon.
John Whitley (whitley) wrote : | # |
Performance notes:
without exceptions in .bzrignore, on the ignore exception working tree:
>> bzrtime "call([
10 loops, best of 3: 286 msec per loop
>> bzrtime "call([
10 loops, best of 3: 289 msec per loop
with '!' and '!!' patterns on the same working tree:
>> bzrtime "call([
10 loops, best of 3: 291 msec per loop
>> bzrtime "call([
10 loops, best of 3: 297 msec per loop
For the above tests, this was added to the tree's .bzrignore:
!./build/
!wombat~
!foobar$$
!RE:nada.*nada
!!./build/
!!./doc/
!!nadabadanada
On my (large, deep) home directory, with ignore exclusion patterns:
>> python -m timeit -s "from subprocess import call, PIPE" "call([
10 loops, best of 3: 278 msec per loop
>> python -m timeit -s "from subprocess import call, PIPE" "call([
10 loops, best of 3: 283 msec per loop
54k file tree test (many copies of exported emacs HEAD tree):
>> time ~/src/ignore-
bzr add 9.13s user 1.16s system 67% cpu 15.266 total
(One cache warming run elided from each set below)
>> time ~/src/bzr/
~/src/bzr/
~/src/bzr/
~/src/bzr/
>> time ~/src/bzr/
~/src/bzr/
~/src/bzr/
~/src/bzr/
With ignore exceptions above added to .bzrignore:
>> time ~/src/bzr/
~/src/bzr/
~/src/bzr/
~/src/bzr/
>> time ~/src/bzr/
~/src/bzr/
~/src/bzr/
~/src/bzr/
The big-tree timings are potentially concerning. It appears that the implementation change incurs a roughly 9.5% hit to bzr status. Adding exception patterns doesn't really change the picture.
If that's too great then my the next step would be a reimplementation of ExcludingGlobster to use the same regex strategy as Globster, but order the pattern types by priority ('!!', '!', then normal pat...
John Whitley (whitley) wrote : | # |
The fix, added tests, and PEP 8 cleanup per John's review have been pushed to this branch.
John A Meinel (jameinel) wrote : | # |
So a 10% hit is a bit of a shame, but not terrible. I also notice that there isn't a huge impact when you actually add the exceptions. Which tells me that it is just the extra calls to Globster.match() (with nothing to match) that are causing the problem.
Looking at the code, Globster.match() should be iterating over self._regex_
Do you care to try an evil hack?
=== modified file 'bzrlib/
--- bzrlib/globbing.py 2010-01-11 16:44:02 +0000
+++ bzrlib/globbing.py 2010-01-14 15:12:34 +0000
@@ -215,6 +215,7 @@
return None
+
class ExceptionGlobst
"""A Globster that supports exception patterns.
@@ -235,9 +236,18 @@
else:
+ self.match = self._match_
+ if not ignores[1]:
+ # If there are no exclusions, then forced inclusions don't need
+ # higher priority
+ ignores[
+ del ignores[2][:]
+ if not ignores[1]:
+ # We don't need any fancy logic, there are only straight 'includes'
+ self.match = self._ignores[
- def match(self, filename):
+ def _match_
:return A matching pattern or None if there is no matching pattern.
@@ -249,6 +259,7 @@
return None
else:
return self._ignores[
+
class _OrderedGlobste
"""A Globster that keeps pattern order."""
(note that you can also get away with just defining "def match()" and then overriding it in __init__, but this is slightly more obvious that we may do that.)
John Whitley (whitley) wrote : | # |
> So a 10% hit is a bit of a shame, but not terrible. I also notice that there
> isn't a huge impact when you actually add the exceptions. Which tells me that
> it is just the extra calls to Globster.match() (with nothing to match) that
> are causing the problem.
Agreed.
> Do you care to try an evil hack?
That's one way, but it still incurs the perf hit when exceptions are in use. To clarify my earlier murmurings, I'm leaning towards an approach where the logical match number can be returned with the results (e.g. by modifying Globster's internals), making ExceptionGlobster a child of the so-modified Globster, and tweaking ExceptionGlobster to 1) assemble the regexes in the correct priority order, 2) preserve range information for the exceptions, and 3) modify match to suit.
In any event, optimizations can be submitted as a subsequent modification.
> (note that you can also get away with just defining "def match()" and then
> overriding it in __init__, but this is slightly more obvious that we may do
> that.)
Ah, nice trick w/ defining match in __init__.
John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
John Whitley wrote:
>> So a 10% hit is a bit of a shame, but not terrible. I also notice that there
>> isn't a huge impact when you actually add the exceptions. Which tells me that
>> it is just the extra calls to Globster.match() (with nothing to match) that
>> are causing the problem.
>
> Agreed.
>
>> Do you care to try an evil hack?
>
> That's one way, but it still incurs the perf hit when exceptions are in use. To clarify my earlier murmurings, I'm leaning towards an approach where the logical match number can be returned with the results (e.g. by modifying Globster's internals), making ExceptionGlobster a child of the so-modified Globster, and tweaking ExceptionGlobster to 1) assemble the regexes in the correct priority order, 2) preserve range information for the exceptions, and 3) modify match to suit.
>
> In any event, optimizations can be submitted as a subsequent modification.
So one thing I was shooting for is that having the feature won't impact
people who aren't using it. Which my version does. Though I've also
worked out a better way.
It is a recognition that we only need to evaluate the "exception" case
if the ignore case matches. And then we only need the double-exception
case when the exception matches (except we also need to check if it
would fit as a regular ignore.)
I pushed this up as lp:~jameinel/bzr/ignore-exception, and have the diff
here.
Can you test the performance hit?
>
>> (note that you can also get away with just defining "def match()" and then
>> overriding it in __init__, but this is slightly more obvious that we may do
>> that.)
>
> Ah, nice trick w/ defining match in __init__.
>
>
v- This is the diff. Sorry to paste it as a quotation, but otherwise my
mail app wraps the lines.
> === modified file 'bzrlib/
> --- bzrlib/globbing.py 2010-01-11 16:44:02 +0000
> +++ bzrlib/globbing.py 2010-01-15 14:37:12 +0000
> @@ -1,4 +1,4 @@
> -# Copyright (C) 2006, 2008 Canonical Ltd
> +# Copyright (C) 2006-2010 Canonical Ltd
>
> # This program is free software; you can redistribute it and/or modify
> # it under the terms of the GNU General Public License as published by
> @@ -215,6 +215,7 @@
> return patterns[
> return None
>
> +
> class ExceptionGlobst
> """A Globster that supports exception patterns.
>
> @@ -235,6 +236,8 @@
> ignores[
> else:
> ignores[
> + # double-exceptions are also simple ignores
> + ignores[
> self._ignores = [Globster(i) for i in ignores]
>
> def match(self, filename):
> @@ -242,13 +245,21 @@
>
> :return A matching pattern or None if there is no matching pattern.
> """
> - double_neg = self._ignores[
> - if double_neg:
> - return "!!%s" % double_neg
> - elif self._ignores[
> - return None
> - else:
> - return self._ignores[
> + m = self._ignores[
> + if m is None:
> + ...
Preview Diff
1 | === modified file 'NEWS' |
2 | --- NEWS 2010-01-11 13:15:01 +0000 |
3 | +++ NEWS 2010-01-11 18:21:15 +0000 |
4 | @@ -17,6 +17,13 @@ |
5 | New Features |
6 | ************ |
7 | |
8 | +* New ignore patterns. Patterns prefixed with '!' are exceptions to |
9 | + ignore patterns and take precedence over regular ignores. Such |
10 | + exceptions are used to specify files that should be versioned which |
11 | + would otherwise be ignored. Patterns prefixed with '!!' act as regular |
12 | + ignore patterns, but have highest precedence, even over the '!' |
13 | + exception patterns. (John Whitley, #428031) |
14 | + |
15 | * Add bug information to log output when available. |
16 | (Neil Martinsen-Burrell, Guillermo Gonzalez, #251729) |
17 | |
18 | |
19 | === modified file 'bzrlib/builtins.py' |
20 | --- bzrlib/builtins.py 2010-01-08 00:05:01 +0000 |
21 | +++ bzrlib/builtins.py 2010-01-11 18:21:15 +0000 |
22 | @@ -2602,6 +2602,13 @@ |
23 | After adding, editing or deleting that file either indirectly by |
24 | using this command or directly by using an editor, be sure to commit |
25 | it. |
26 | + |
27 | + Patterns prefixed with '!' are exceptions to ignore patterns and take |
28 | + precedence over regular ignores. Such exceptions are used to specify |
29 | + files that should be versioned which would otherwise be ignored. |
30 | + |
31 | + Patterns prefixed with '!!' act as regular ignore patterns, but have |
32 | + precedence over the '!' exception patterns. |
33 | |
34 | Note: ignore patterns containing shell wildcards must be quoted from |
35 | the shell on Unix. |
36 | @@ -2611,10 +2618,14 @@ |
37 | |
38 | bzr ignore ./Makefile |
39 | |
40 | - Ignore class files in all directories:: |
41 | + Ignore .class files in all directories...:: |
42 | |
43 | bzr ignore "*.class" |
44 | |
45 | + ...but do not ignore "special.class":: |
46 | + |
47 | + bzr ignore "!special.class" |
48 | + |
49 | Ignore .o files under the lib directory:: |
50 | |
51 | bzr ignore "lib/**/*.o" |
52 | @@ -2626,6 +2637,13 @@ |
53 | Ignore everything but the "debian" toplevel directory:: |
54 | |
55 | bzr ignore "RE:(?!debian/).*" |
56 | + |
57 | + Ignore everything except the "local" toplevel directory, |
58 | + but always ignore "*~" autosave files, even under local/:: |
59 | + |
60 | + bzr ignore "*" |
61 | + bzr ignore "!./local" |
62 | + bzr ignore "!!*~" |
63 | """ |
64 | |
65 | _see_also = ['status', 'ignored', 'patterns'] |
66 | |
67 | === modified file 'bzrlib/globbing.py' |
68 | --- bzrlib/globbing.py 2009-11-11 21:38:02 +0000 |
69 | +++ bzrlib/globbing.py 2010-01-11 18:21:15 +0000 |
70 | @@ -215,6 +215,40 @@ |
71 | return patterns[match.lastindex -1] |
72 | return None |
73 | |
74 | +class ExceptionGlobster(object): |
75 | + """A Globster that supports exception patterns. |
76 | + |
77 | + Exceptions are ignore patterns prefixed with '!'. Exception |
78 | + patterns take precedence over regular patterns and cause a |
79 | + matching filename to return None from the match() function. |
80 | + Patterns using a '!!' prefix are highest precedence, and act |
81 | + as regular ignores. '!!' patterns are useful to establish ignores |
82 | + that apply under paths specified by '!' exception patterns. |
83 | + """ |
84 | + |
85 | + def __init__(self,patterns): |
86 | + ignores = [[], [], []] |
87 | + for p in patterns: |
88 | + if p.startswith(u'!!'): |
89 | + ignores[2].append(p[2:]) |
90 | + elif p.startswith(u'!'): |
91 | + ignores[1].append(p[1:]) |
92 | + else: |
93 | + ignores[0].append(p) |
94 | + self._ignores = [Globster(i) for i in ignores] |
95 | + |
96 | + def match(self, filename): |
97 | + """Searches for a pattern that matches the given filename. |
98 | + |
99 | + :return A matching pattern or None if there is no matching pattern. |
100 | + """ |
101 | + double_neg = self._ignores[2].match(filename) |
102 | + if double_neg: |
103 | + return "!!%s" % double_neg |
104 | + elif self._ignores[1].match(filename): |
105 | + return None |
106 | + else: |
107 | + return self._ignores[0].match(filename) |
108 | |
109 | class _OrderedGlobster(Globster): |
110 | """A Globster that keeps pattern order.""" |
111 | @@ -244,8 +278,7 @@ |
112 | |
113 | Doesn't normalize regular expressions - they may contain escapes. |
114 | """ |
115 | - |
116 | - if not pattern.startswith('RE:'): |
117 | + if not (pattern.startswith('RE:') or pattern.startswith('!RE:')): |
118 | pattern = _slashes.sub('/', pattern) |
119 | if len(pattern) > 1: |
120 | pattern = pattern.rstrip('/') |
121 | |
122 | === modified file 'bzrlib/help_topics/en/patterns.txt' |
123 | --- bzrlib/help_topics/en/patterns.txt 2008-06-25 07:17:14 +0000 |
124 | +++ bzrlib/help_topics/en/patterns.txt 2010-01-11 18:21:15 +0000 |
125 | @@ -23,3 +23,6 @@ |
126 | patterns are identified by a 'RE:' prefix followed by the regular |
127 | expression. Regular expression patterns may not include named or |
128 | numbered groups. |
129 | + |
130 | +Ignore patterns may be prefixed with '!', which means that a filename |
131 | +matched by that pattern will not be ignored. |
132 | |
133 | === modified file 'bzrlib/tests/per_workingtree/test_is_ignored.py' |
134 | --- bzrlib/tests/per_workingtree/test_is_ignored.py 2009-07-10 07:14:02 +0000 |
135 | +++ bzrlib/tests/per_workingtree/test_is_ignored.py 2010-01-11 18:21:15 +0000 |
136 | @@ -31,11 +31,15 @@ |
137 | ('.bzrignore', './rootdir\n' |
138 | 'randomfile*\n' |
139 | '*bar\n' |
140 | + '!bazbar\n' |
141 | '?foo\n' |
142 | '*.~*\n' |
143 | 'dir1/*f1\n' |
144 | 'dir1/?f2\n' |
145 | + 'RE:dir2/.*\.wombat\n' |
146 | 'path/from/ro?t\n' |
147 | + '**/piffle.py\n' |
148 | + '!b/piffle.py\n' |
149 | 'unicode\xc2\xb5\n' # u'\xb5'.encode('utf8') |
150 | 'dos\r\n' |
151 | '\n' # empty line |
152 | @@ -58,6 +62,12 @@ |
153 | self.assertEqual("path/from/ro?t", tree.is_ignored('path/from/root')) |
154 | self.assertEqual("path/from/ro?t", tree.is_ignored('path/from/roat')) |
155 | self.assertEqual(None, tree.is_ignored('roat')) |
156 | + |
157 | + self.assertEqual('**/piffle.py', tree.is_ignored('piffle.py')) |
158 | + self.assertEqual('**/piffle.py', tree.is_ignored('a/piffle.py')) |
159 | + self.assertEqual(None, tree.is_ignored('b/piffle.py')) # exclusion |
160 | + self.assertEqual('**/piffle.py', tree.is_ignored('foo/bar/piffle.py')) |
161 | + self.assertEqual(None, tree.is_ignored('p/iffle.py')) |
162 | |
163 | self.assertEqual(u'unicode\xb5', tree.is_ignored(u'unicode\xb5')) |
164 | self.assertEqual(u'unicode\xb5', tree.is_ignored(u'subdir/unicode\xb5')) |
165 | @@ -72,6 +82,8 @@ |
166 | self.assertEqual('*bar', tree.is_ignored(r'foo\nbar')) |
167 | self.assertEqual('*bar', tree.is_ignored('bar')) |
168 | self.assertEqual('*bar', tree.is_ignored('.bar')) |
169 | + |
170 | + self.assertEqual(None, tree.is_ignored('bazbar')) # exclusion |
171 | |
172 | self.assertEqual('?foo', tree.is_ignored('afoo')) |
173 | self.assertEqual('?foo', tree.is_ignored('.foo')) |
174 | @@ -84,6 +96,9 @@ |
175 | |
176 | self.assertEqual('dir1/?f2', tree.is_ignored('dir1/ff2')) |
177 | self.assertEqual('dir1/?f2', tree.is_ignored('dir1/.f2')) |
178 | + |
179 | + self.assertEqual('RE:dir2/.*\.wombat', tree.is_ignored('dir2/foo.wombat')) |
180 | + self.assertEqual(None, tree.is_ignored('dir2/foo')) |
181 | |
182 | # Blank lines and comments should be ignored |
183 | self.assertEqual(None, tree.is_ignored('')) |
184 | |
185 | === modified file 'bzrlib/tests/test_globbing.py' |
186 | --- bzrlib/tests/test_globbing.py 2009-11-11 21:38:02 +0000 |
187 | +++ bzrlib/tests/test_globbing.py 2010-01-11 18:21:15 +0000 |
188 | @@ -17,6 +17,7 @@ |
189 | |
190 | from bzrlib.globbing import ( |
191 | Globster, |
192 | + ExceptionGlobster, |
193 | _OrderedGlobster, |
194 | normalize_pattern |
195 | ) |
196 | @@ -307,6 +308,30 @@ |
197 | self.assertEqual(patterns[x],globster.match(filename)) |
198 | self.assertEqual(None,globster.match('foobar.300')) |
199 | |
200 | +class TestExceptionGlobster(TestCase): |
201 | + |
202 | + def test_exclusion_patterns(self): |
203 | + """test that exception patterns are not matched""" |
204 | + patterns = [ u'*', u'!./local', u'!./local/**/*', u'!RE:\.z.*',u'!!./.zcompdump' ] |
205 | + globster = ExceptionGlobster(patterns) |
206 | + self.assertEqual(u'*', globster.match('tmp/foo.txt')) |
207 | + self.assertEqual(None, globster.match('local')) |
208 | + self.assertEqual(None, globster.match('local/bin/wombat')) |
209 | + self.assertEqual(None, globster.match('.zshrc')) |
210 | + self.assertEqual(None, globster.match('.zfunctions/fiddle/flam')) |
211 | + self.assertEqual(u'!!./.zcompdump', globster.match('.zcompdump')) |
212 | + |
213 | + def test_exclusion_order(self): |
214 | + """test that ordering of exclusion patterns does not matter""" |
215 | + patterns = [ u'static/**/*.html', u'!static/**/versionable.html'] |
216 | + globster = ExceptionGlobster(patterns) |
217 | + self.assertEqual(u'static/**/*.html', globster.match('static/foo.html')) |
218 | + self.assertEqual(None, globster.match('static/versionable.html')) |
219 | + self.assertEqual(None, globster.match('static/bar/versionable.html')) |
220 | + globster = ExceptionGlobster(reversed(patterns)) |
221 | + self.assertEqual(u'static/**/*.html', globster.match('static/foo.html')) |
222 | + self.assertEqual(None, globster.match('static/versionable.html')) |
223 | + self.assertEqual(None, globster.match('static/bar/versionable.html')) |
224 | |
225 | class TestOrderedGlobster(TestCase): |
226 | |
227 | |
228 | === modified file 'bzrlib/tests/test_ignores.py' |
229 | --- bzrlib/tests/test_ignores.py 2009-03-23 14:59:43 +0000 |
230 | +++ bzrlib/tests/test_ignores.py 2010-01-11 18:21:15 +0000 |
231 | @@ -34,6 +34,8 @@ |
232 | '\n' # empty line |
233 | '#comment\n' |
234 | ' xx \n' # whitespace |
235 | + '!RE:^\.z.*\n' |
236 | + '!!./.zcompdump\n' |
237 | )) |
238 | self.assertEqual(set(['./rootdir', |
239 | 'randomfile*', |
240 | @@ -41,6 +43,8 @@ |
241 | u'unicode\xb5', |
242 | 'dos', |
243 | ' xx ', |
244 | + '!RE:^\.z.*', |
245 | + '!!./.zcompdump', |
246 | ]), ignored) |
247 | |
248 | def test_parse_empty(self): |
249 | |
250 | === modified file 'bzrlib/workingtree.py' |
251 | --- bzrlib/workingtree.py 2009-12-22 05:24:50 +0000 |
252 | +++ bzrlib/workingtree.py 2010-01-11 18:21:15 +0000 |
253 | @@ -1741,13 +1741,15 @@ |
254 | r"""Check whether the filename matches an ignore pattern. |
255 | |
256 | Patterns containing '/' or '\' need to match the whole path; |
257 | - others match against only the last component. |
258 | + others match against only the last component. Patterns starting |
259 | + with '!' are ignore exceptions. Exceptions take precedence |
260 | + over regular patterns and cause the filename to not be ignored. |
261 | |
262 | If the file is ignored, returns the pattern which caused it to |
263 | be ignored, otherwise None. So this can simply be used as a |
264 | boolean if desired.""" |
265 | if getattr(self, '_ignoreglobster', None) is None: |
266 | - self._ignoreglobster = globbing.Globster(self.get_ignore_list()) |
267 | + self._ignoreglobster = globbing.ExceptionGlobster(self.get_ignore_list()) |
268 | return self._ignoreglobster.match(filename) |
269 | |
270 | def kind(self, file_id): |
This is an implementation of ignore exceptions, using a syntax borrowed from git that prefixes ignores with an exclamation point '!'. Per discussion on the list, I've also extended this with a syntax ('!!' patterns) to allow ignores be specified under excepted paths. NEWS, help, and internal comments in this branch should have been updated w.r.t. this addition.
bug: https:/ /bugs.launchpad .net/bzr/ +bug/428031