Merge lp:~parthm/bzr/300062-invalid-pattern-warnings into lp:bzr

Proposed by Parth Malwankar
Status: Rejected
Rejected by: Robert Collins
Proposed branch: lp:~parthm/bzr/300062-invalid-pattern-warnings
Merge into: lp:bzr
Diff against target: 680 lines (+383/-39)
12 files modified
NEWS (+5/-0)
bzrlib/builtins.py (+10/-0)
bzrlib/errors.py (+1/-1)
bzrlib/globbing.py (+143/-28)
bzrlib/lazy_regex.py (+6/-6)
bzrlib/tests/blackbox/test_add.py (+28/-0)
bzrlib/tests/blackbox/test_ignore.py (+32/-0)
bzrlib/tests/blackbox/test_ignored.py (+15/-0)
bzrlib/tests/blackbox/test_ls.py (+23/-0)
bzrlib/tests/blackbox/test_status.py (+69/-0)
bzrlib/tests/test_globbing.py (+41/-4)
bzrlib/tests/test_ignores.py (+10/-0)
To merge this branch: bzr merge lp:~parthm/bzr/300062-invalid-pattern-warnings
Reviewer Review Type Date Requested Status
John A Meinel Needs Fixing
Robert Collins Pending
bzr-core Pending
Review via email: mp+26809@code.launchpad.net

Commit message

Better handling for invalid patterns in .bzrignore and user ignore. ignore command rejects bad patterns passed to it.

Description of the change

=== Fixed Bug #300062 ===

Based on the earlier discussion
https://code.edge.launchpad.net/~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern/+merge/26400
this branch make bzr issue a warning if there are invalid patterns in any ignore file. 'ls', 'add', 'ignore', 'ignored' and 'status' work as expected. Below is a sample interaction of these commands.

[aaa]% cat .bzrignore
RE:*.cpp
foo
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ignore xyz "RE:[[[[[" pqr
bzr: warning: Skipped invalid pattern argument(s):
  RE:[[[[[
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
[aaa]% cat .bzrignore
RE:*.cpp
foo
xyz
pqr
[aaa]%

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr st -S
 M .bzrignore
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
? one.txt
? two.txt
? x
? y
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr st
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
modified:
  .bzrignore
unknown:
  one.txt
  two.txt
  x
  y

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ls -i
.bzrignore.~1~
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ls -u
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
one.txt
two.txt
x
y

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr ignored
.bzrignore.~1~ *~
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).

=================================================================================
[aaa]% ~/src/bzr.dev/300062-invalid-pattern-warnings/bzr add
bzr: warning: Skipped invalid pattern(s):
  RE:*.cpp
Please fix above patterns in ".bzrignore" or "/home/parthm/.bazaar/ignore".
Bazaar commands will NOT ignore files meant to match above pattern(s).
adding one.txt
adding two.txt
adding x
adding y

To post a comment you must log in.
Revision history for this message
John A Meinel (jameinel) wrote :

110 + if validp:
111 + g = Globster(validp)
112 + new_regex_patterns.append(g.regex_patterns[0])

^- This seems odd. A comment here saying that we are collapsing the valid requests into a single compiled regex would be useful.

158 +def is_pattern_valid(pattern):

^- It seems like it would be nice if this could share the code that determines what translation unit to use. Otherwise it seems likely to get skew if one of them is updated. For example, the proposed 'FixedString' prefix. The fallout is that it would normally work fine, but if someone had a *different* invalid regex, then suddenly it would start causing this one to fail as well.

I don't understand why you changed variables from being private. In lazy_regex you renamed _real_re_compile to real_re_compile, and in globbing you changed _regex_patterns.

It is perfectly fine for *tests* to check private variables, it is good to keep them private so that plugins get a feeling that they may not be stable names.

...

342 + def test_invalid_pattern_in_ignore_file(self):
343 + """Existing invalid patterns should be shown.
344 + """
345 + tree = self.make_branch_and_tree('.')
346 + self.build_tree(['file', 'foo.c'])
347 + f = open('.bzrignore', 'w')
348 + try:
349 + f.write(u"*.c\nRE:[\n") # invalid pattern
350 + finally:
351 + f.close()

^- I think you can simplify this to:
self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])

It is pretty bad form to write *unicode* strings to f.write() given that in python2 it is an 8-bit request. (It happens to be automatically casting your unicode string down to an ascii string.)

This test might be easier to read if we turned it into a "run_script" test, but that may not be easily available here.

The same comments apply to test_add_with_invalid_ignore_in_global_ignore and
test_invalid_pattern_in_ignore_file in the 2 other places.

review: Needs Fixing
Revision history for this message
Parth Malwankar (parthm) wrote :

Thanks for the review John.

> 110 + if validp:
> 111 + g = Globster(validp)
> 112 + new_regex_patterns.append(g.regex_patterns[0])
>
> ^- This seems odd. A comment here saying that we are collapsing the valid
> requests into a single compiled regex would be useful.
>

Done.

> 158 +def is_pattern_valid(pattern):
>
> ^- It seems like it would be nice if this could share the code that determines
> what translation unit to use. Otherwise it seems likely to get skew if one of
> them is updated. For example, the proposed 'FixedString' prefix. The fallout
> is that it would normally work fine, but if someone had a *different* invalid
> regex, then suddenly it would start causing this one to fail as well.
>

I have pulled out the common code into a class _Pattern. _Pattern has an
identify method that categorizes the pattern as fullpath, extension or
basename. This also provides pattern->translator and pattern->prefix
mapping as there are required in multiple places. Unit test is also added
for this.

>
> I don't understand why you changed variables from being private. In lazy_regex
> you renamed _real_re_compile to real_re_compile, and in globbing you changed
> _regex_patterns.
>

real_re_compile is imported by bzrlib.globbing as is_pattern_valid uses
real_re_compile to check if a pattern is valid. Hence I made that public.

For regex_patterns, in case of a pre-existing bad pattern Globbing.match
creates an instance of Globbing and accesses regex_patterns directly as
instance.regex_patterns. Considering that this uses is within the same
file we could allow it to be private but I thought making it public
may be cleaner.

Let me know if the above seems reasonable to you. I haven't done any
changes for this yet.

> It is perfectly fine for *tests* to check private variables, it is good to
> keep them private so that plugins get a feeling that they may not be stable
> names.
>
>
> ...
>
> 342 + def test_invalid_pattern_in_ignore_file(self):
> 343 + """Existing invalid patterns should be shown.
> 344 + """
> 345 + tree = self.make_branch_and_tree('.')
> 346 + self.build_tree(['file', 'foo.c'])
> 347 + f = open('.bzrignore', 'w')
> 348 + try:
> 349 + f.write(u"*.c\nRE:[\n") # invalid pattern
> 350 + finally:
> 351 + f.close()
>
> ^- I think you can simplify this to:
> self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])
>

Thanks. That very handy. I wasn't aware of this method.
The tests have been updated to use this.

> It is pretty bad form to write *unicode* strings to f.write() given that in
> python2 it is an 8-bit request. (It happens to be automatically casting your
> unicode string down to an ascii string.)
>
> This test might be easier to read if we turned it into a "run_script" test, hha
> but that may not be easily available here.
>
> The same comments apply to test_add_with_invalid_ignore_in_global_ignore and
> test_invalid_pattern_in_ignore_file in the 2 other places.

Revision history for this message
Robert Collins (lifeless) wrote :

I have to say, I still think that this would be better solved by just
making the compile error clearer and having it throw an external
BZRError - no expansion of interfaces needed, the improvements can be
used by other things using regexps etc.

-Rob

Revision history for this message
Parth Malwankar (parthm) wrote :

Well, we now have both solutions in place. The warnings based (this) and the error based[1] though the latter may need some cleanup and review.

My personal preference is towards an error based approach. I feel warnings are may be ok for operations like 'status' that don't change anything, but specifically for something like 'add' an error would probably be better so that the user doesn't miss the warning message and go ahead and do the commit. We saw The user could 'revert' but it may be a bit of a pain to do that, or worse, user may miss the message and end up with multiple commits after that containing large binary files :) The 'add' operation with warnings specifically reminds me of bug #549310 where whoami was guessing and I ended up with multiple commits with a bad username[2]. I would prefer a uniform error based approach as Robert suggested.
I don't expect the rollout of an error based approach to be a problem as bzr currently crashes on bad patterns so the existing ignore files will not have bad patterns in them.

That said, I think both warning and error based approach are an improvement over the current situation where bzr shows a stack trace which can scare users. It would be great if we can align on one of these after some more discussion and go ahead.

[1] https://code.edge.launchpad.net/~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern/+merge/26400
[2] https://bugs.launchpad.net/bzr/+bug/549310/comments/3

Revision history for this message
Robert Collins (lifeless) wrote :

So, lets go with an error based approach for now, for the following reasons:
 - its a very small change compared to the current code (wrap the
regex compiler to know its taking a list in and print the errorneous
pattern out)
 - it seems genuinely better for some use cases like add
 - we can add a warning based one later if this feels insufficient.

In general I do think its better to stop early and cleanly in commands
that can output lots of stuff - if we show a lot then users will often
miss the fine print.

I'll review the error based one again to move things forward.

Revision history for this message
Robert Collins (lifeless) wrote :

Marking this one as rejected for now.

Revision history for this message
Martin Pool (mbp) wrote :

On 21 June 2010 16:46, Robert Collins <email address hidden> wrote:
> So, lets go with an error based approach for now, for the following reasons:
>  - its a very small change compared to the current code (wrap the
> regex compiler to know its taking a list in and print the errorneous
> pattern out)
>  - it seems genuinely better for some use cases like add
>  - we can add a warning based one later if this feels insufficient.
>
> In general I do think its better to stop early and cleanly in commands
> that can output lots of stuff - if we show a lot then users will often
> miss the fine print.
>
> I'll review the error based one again to move things forward.

I said previously that I slightly preferred a warning if it was
equally easy to implement. But for the sake of simplicity and of
getting an error in cases where we do need it, an error is fine with
me.

--
Martin

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Martin Pool wrote:
> On 21 June 2010 16:46, Robert Collins <email address hidden> wrote:
>> So, lets go with an error based approach for now, for the following reasons:
>> - its a very small change compared to the current code (wrap the
>> regex compiler to know its taking a list in and print the errorneous
>> pattern out)
>> - it seems genuinely better for some use cases like add
>> - we can add a warning based one later if this feels insufficient.
>>
>> In general I do think its better to stop early and cleanly in commands
>> that can output lots of stuff - if we show a lot then users will often
>> miss the fine print.
>>
>> I'll review the error based one again to move things forward.
>
> I said previously that I slightly preferred a warning if it was
> equally easy to implement. But for the sake of simplicity and of
> getting an error in cases where we do need it, an error is fine with
> me.
>

We've had a bad habit in the past of falling over on trivial details and
preventing users from getting stuff done. I'm not saying that is the
case here, and as always there is a balance to be struck.

As mentioned, this can't be considered a regression, since we used to
raise an unhelpful exception, so raising a helpful one is strictly
better than what we had.

So certainly, land the update (I have not reviewed the code). I'm not as
convinced that failing early is always the best answer, it is probably
ok here.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwg10gACgkQJdeBCYSNAAPBLwCgokPBGQCdKuxiom9uOmK2DKmN
Q0UAoKYen2O25wLOuQRdvqyUVRV42iLn
=Qoef
-----END PGP SIGNATURE-----

Unmerged revisions

5292. By Parth Malwankar

added test for _Pattern

5291. By Parth Malwankar

cleanup tests.

5290. By Parth Malwankar

cleanup of pattern categorization code

5289. By Parth Malwankar

merged in changes from trunk

5288. By Parth Malwankar

added _PatternType class to categorize patterns.

5287. By Parth Malwankar

updated NEWS

5286. By Parth Malwankar

merged in changes from trunk.

5285. By Parth Malwankar

added invalid patterns tests for 'ignore'

5284. By Parth Malwankar

added invalid pattern test for ignored.

5283. By Parth Malwankar

added invalid pattern tests for 'add' and 'ls'

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2010-06-18 11:57:13 +0000
+++ NEWS 2010-06-19 13:29:28 +0000
@@ -56,6 +56,11 @@
56 test that all commands available to the test suite have help.56 test that all commands available to the test suite have help.
57 (Robert Collins, #177500)57 (Robert Collins, #177500)
5858
59* Invalid patterns in ignore files are now displayed as a warning to the
60 user. Bazaar does not throw a stack trace when an invalid pattern is
61 encountered by commands.
62 (Parth Malwankar, #300062)
63
59* Raise ValueError instead of a string exception.64* Raise ValueError instead of a string exception.
60 (John Arbash Meinel, #586926)65 (John Arbash Meinel, #586926)
6166
6267
=== modified file 'bzrlib/builtins.py'
--- bzrlib/builtins.py 2010-06-17 08:53:15 +0000
+++ bzrlib/builtins.py 2010-06-19 13:29:28 +0000
@@ -2717,6 +2717,16 @@
2717 (len(name_pattern) > 1 and name_pattern[1] == ':')):2717 (len(name_pattern) > 1 and name_pattern[1] == ':')):
2718 raise errors.BzrCommandError(2718 raise errors.BzrCommandError(
2719 "NAME_PATTERN should not be an absolute path")2719 "NAME_PATTERN should not be an absolute path")
2720 name_pattern_list, invalid_patterns = \
2721 globbing.filter_invalid_patterns(name_pattern_list)
2722 if invalid_patterns:
2723 pats = '\n '.join(invalid_patterns)
2724 ui.ui_factory.show_warning(
2725 'Skipped invalid pattern argument(s):\n %s' % pats)
2726 if not name_pattern_list:
2727 # After removing invalid patterns, the list may be
2728 # empty. There is nothing to do in this case.
2729 return
2720 tree, relpath = WorkingTree.open_containing(directory)2730 tree, relpath = WorkingTree.open_containing(directory)
2721 ignores.tree_ignores_add_patterns(tree, name_pattern_list)2731 ignores.tree_ignores_add_patterns(tree, name_pattern_list)
2722 ignored = globbing.Globster(name_pattern_list)2732 ignored = globbing.Globster(name_pattern_list)
27232733
=== modified file 'bzrlib/errors.py'
--- bzrlib/errors.py 2010-06-17 11:52:13 +0000
+++ bzrlib/errors.py 2010-06-19 13:29:28 +0000
@@ -3149,10 +3149,10 @@
3149 def __init__(self, bzrdir):3149 def __init__(self, bzrdir):
3150 self.bzrdir = bzrdir3150 self.bzrdir = bzrdir
31513151
3152
3152class NoWhoami(BzrError):3153class NoWhoami(BzrError):
31533154
3154 _fmt = ('Unable to determine your name.\n'3155 _fmt = ('Unable to determine your name.\n'
3155 "Please, set your name with the 'whoami' command.\n"3156 "Please, set your name with the 'whoami' command.\n"
3156 'E.g. bzr whoami "Your Name <name@example.com>"')3157 'E.g. bzr whoami "Your Name <name@example.com>"')
31573158
3158
31593159
=== modified file 'bzrlib/globbing.py'
--- bzrlib/globbing.py 2010-02-17 17:11:16 +0000
+++ bzrlib/globbing.py 2010-06-19 13:29:28 +0000
@@ -22,6 +22,12 @@
2222
23import re23import re
2424
25import bzrlib
26from bzrlib import (
27 config,
28 ui,
29 )
30from bzrlib.lazy_regex import real_re_compile
25from bzrlib.trace import (31from bzrlib.trace import (
26 warning32 warning
27 )33 )
@@ -152,6 +158,47 @@
152 return _sub_basename(pattern[2:])158 return _sub_basename(pattern[2:])
153159
154160
161class _Pattern(object):
162 """Wrapper for managing pattern categories.
163 """
164 TYPE_FULLPATH = 1
165 TYPE_BASENAME = 2
166 TYPE_EXTENSION = 3
167
168 translators = {
169 TYPE_FULLPATH : _sub_fullpath,
170 TYPE_BASENAME : _sub_basename,
171 TYPE_EXTENSION : _sub_extension,
172 }
173
174 # Prefixes used to combine various patterns.
175 # See: Globster._add_patterns
176 PREFIX_EXTENSION = r'(?:.*/)?(?!.*/)(?:.*\.)'
177 PREFIX_BASENAME = r'(?:.*/)?(?!.*/)'
178 PREFIX_FULLPATH = r''
179
180 prefixes = {
181 TYPE_FULLPATH : PREFIX_FULLPATH,
182 TYPE_BASENAME : PREFIX_BASENAME,
183 TYPE_EXTENSION : PREFIX_EXTENSION,
184 }
185
186 @staticmethod
187 def identify(pattern):
188 """Returns pattern category.
189
190 :param pattern: normalized pattern.
191 Identify if a pattern is fullpath, basename or extension
192 and returns the appropriate type.
193 """
194 if pattern.startswith(u'RE:') or u'/' in pattern:
195 return _Pattern.TYPE_FULLPATH
196 elif pattern.startswith(u'*.'):
197 return _Pattern.TYPE_EXTENSION
198 else:
199 return _Pattern.TYPE_BASENAME
200
201
155class Globster(object):202class Globster(object):
156 """A simple wrapper for a set of glob patterns.203 """A simple wrapper for a set of glob patterns.
157204
@@ -164,7 +211,7 @@
164 a super-regex containing groups of up to 99 patterns.211 a super-regex containing groups of up to 99 patterns.
165 The 99 limitation is due to the grouping limit of the Python re module.212 The 99 limitation is due to the grouping limit of the Python re module.
166 The resulting super-regex and associated patterns are stored as a list of213 The resulting super-regex and associated patterns are stored as a list of
167 (regex,[patterns]) in _regex_patterns.214 (regex,[patterns]) in regex_patterns.
168215
169 For performance reasons the patterns are categorised as extension patterns216 For performance reasons the patterns are categorised as extension patterns
170 (those that match against a file extension), basename patterns217 (those that match against a file extension), basename patterns
@@ -178,46 +225,83 @@
178 patterns.225 patterns.
179 """226 """
180 def __init__(self, patterns):227 def __init__(self, patterns):
181 self._regex_patterns = []228 self.regex_patterns = []
182 path_patterns = []229 path_patterns = []
183 base_patterns = []230 base_patterns = []
184 ext_patterns = []231 ext_patterns = []
232 pattern_lists = {
233 _Pattern.TYPE_FULLPATH : path_patterns,
234 _Pattern.TYPE_EXTENSION : ext_patterns,
235 _Pattern.TYPE_BASENAME : base_patterns,
236 }
185 for pat in patterns:237 for pat in patterns:
186 pat = normalize_pattern(pat)238 pat = normalize_pattern(pat)
187 if pat.startswith(u'RE:') or u'/' in pat:239 pattern_lists[_Pattern.identify(pat)].append(pat)
188 path_patterns.append(pat)240 for k, v in pattern_lists.iteritems():
189 elif pat.startswith(u'*.'):241 self._add_patterns(v, _Pattern.translators[k],
190 ext_patterns.append(pat)242 _Pattern.prefixes[k])
191 else:
192 base_patterns.append(pat)
193 self._add_patterns(ext_patterns,_sub_extension,
194 prefix=r'(?:.*/)?(?!.*/)(?:.*\.)')
195 self._add_patterns(base_patterns,_sub_basename,
196 prefix=r'(?:.*/)?(?!.*/)')
197 self._add_patterns(path_patterns,_sub_fullpath)
198243
199 def _add_patterns(self, patterns, translator, prefix=''):244 def _add_patterns(self, patterns, translator, prefix):
200 while patterns:245 while patterns:
201 grouped_rules = ['(%s)' % translator(pat) for pat in patterns[:99]]246 grouped_rules = ['(%s)' % translator(pat) for pat in patterns[:99]]
202 joined_rule = '%s(?:%s)$' % (prefix, '|'.join(grouped_rules))247 joined_rule = '%s(?:%s)$' % (prefix, '|'.join(grouped_rules))
203 self._regex_patterns.append((re.compile(joined_rule, re.UNICODE),248 self.regex_patterns.append((re.compile(joined_rule, re.UNICODE),
204 patterns[:99]))249 patterns[:99]))
205 patterns = patterns[99:]250 patterns = patterns[99:]
206251
207 def match(self, filename):252 def match(self, filename):
253 """Search for a pattern that matches the given filename.
254
255 :return A matching pattern or None if there is no matching pattern.
256 In case of a re.error exception,
257 Scan self._regex_patterns.
258 Issue a warning to the user about the bad pattern.
259 Remove bad pattern.
260 Retry match.
261 """
262 try:
263 return self._match(filename)
264 except re.error, e:
265 new_regex_patterns = []
266 for regex, patterns in self.regex_patterns:
267 try:
268 c = regex._compile_and_collapse()
269 new_regex_patterns.append((regex, patterns))
270 except re.error, e:
271 validp, invalidp = filter_invalid_patterns(patterns)
272 user_ignore = config.user_ignore_config_filename()
273 bzrignore = bzrlib.IGNORE_FILENAME
274 msg = ('Skipped invalid pattern(s):\n %s\n'
275 'Please fix above patterns in '
276 '"%s" or "%s".\n'
277 'Bazaar commands will NOT ignore files meant to '
278 'match above pattern(s).' %
279 ('\n '.join(invalidp), bzrignore, user_ignore))
280 ui.ui_factory.show_warning(msg)
281 if validp:
282 # Skip invalid pattern and accumulate valid patterns
283 # into new_regex_patterns so that they can be compiled
284 # into a single regex for better performance.
285 g = Globster(validp)
286 new_regex_patterns.append(g.regex_patterns[0])
287 self.regex_patterns = new_regex_patterns
288 return self._match(filename)
289
290 def _match(self, filename):
208 """Searches for a pattern that matches the given filename.291 """Searches for a pattern that matches the given filename.
209292
210 :return A matching pattern or None if there is no matching pattern.293 :return A matching pattern or None if there is no matching pattern.
211 """294 """
212 for regex, patterns in self._regex_patterns:295 for regex, patterns in self.regex_patterns:
213 match = regex.match(filename)296 match = regex.match(filename)
214 if match:297 if match:
215 return patterns[match.lastindex -1]298 return patterns[match.lastindex -1]
216 return None299 return None
217300
301
218class ExceptionGlobster(object):302class ExceptionGlobster(object):
219 """A Globster that supports exception patterns.303 """A Globster that supports exception patterns.
220 304
221 Exceptions are ignore patterns prefixed with '!'. Exception305 Exceptions are ignore patterns prefixed with '!'. Exception
222 patterns take precedence over regular patterns and cause a 306 patterns take precedence over regular patterns and cause a
223 matching filename to return None from the match() function. 307 matching filename to return None from the match() function.
@@ -225,7 +309,7 @@
225 as regular ignores. '!!' patterns are useful to establish ignores309 as regular ignores. '!!' patterns are useful to establish ignores
226 that apply under paths specified by '!' exception patterns.310 that apply under paths specified by '!' exception patterns.
227 """311 """
228 312
229 def __init__(self,patterns):313 def __init__(self,patterns):
230 ignores = [[], [], []]314 ignores = [[], [], []]
231 for p in patterns:315 for p in patterns:
@@ -236,7 +320,7 @@
236 else:320 else:
237 ignores[0].append(p)321 ignores[0].append(p)
238 self._ignores = [Globster(i) for i in ignores]322 self._ignores = [Globster(i) for i in ignores]
239 323
240 def match(self, filename):324 def match(self, filename):
241 """Searches for a pattern that matches the given filename.325 """Searches for a pattern that matches the given filename.
242326
@@ -250,6 +334,7 @@
250 else:334 else:
251 return self._ignores[0].match(filename)335 return self._ignores[0].match(filename)
252336
337
253class _OrderedGlobster(Globster):338class _OrderedGlobster(Globster):
254 """A Globster that keeps pattern order."""339 """A Globster that keeps pattern order."""
255340
@@ -259,17 +344,12 @@
259 :param patterns: sequence of glob patterns344 :param patterns: sequence of glob patterns
260 """345 """
261 # Note: This could be smarter by running like sequences together346 # Note: This could be smarter by running like sequences together
262 self._regex_patterns = []347 self.regex_patterns = []
263 for pat in patterns:348 for pat in patterns:
264 pat = normalize_pattern(pat)349 pat = normalize_pattern(pat)
265 if pat.startswith(u'RE:') or u'/' in pat:350 pat_type = _Pattern.identify(pat)
266 self._add_patterns([pat], _sub_fullpath)351 self._add_patterns([pat], _Pattern.translators[pat_type],
267 elif pat.startswith(u'*.'):352 _Pattern.prefixes[pat_type])
268 self._add_patterns([pat], _sub_extension,
269 prefix=r'(?:.*/)?(?!.*/)(?:.*\.)')
270 else:
271 self._add_patterns([pat], _sub_basename,
272 prefix=r'(?:.*/)?(?!.*/)')
273353
274354
275_slashes = re.compile(r'[\\/]+')355_slashes = re.compile(r'[\\/]+')
@@ -283,3 +363,38 @@
283 if len(pattern) > 1:363 if len(pattern) > 1:
284 pattern = pattern.rstrip('/')364 pattern = pattern.rstrip('/')
285 return pattern365 return pattern
366
367
368def is_pattern_valid(pattern):
369 """Returns True if pattern is valid.
370
371 :param pattern: Normalized pattern.
372 is_pattern_valid() assumes pattern to be normalized.
373 see: globbing.normalize_pattern
374 """
375 result = True
376 translator = _Pattern.translators[_Pattern.identify(pattern)]
377 tpattern = '(%s)' % translator(pattern)
378 try:
379 cpattern = real_re_compile(tpattern, re.UNICODE)
380 except re.error, e:
381 result = False
382 return result
383
384
385def filter_invalid_patterns(patterns):
386 """Returns valid and invalid patterns.
387
388 :param patterns: List of normalized patterns.
389 filter_invalid_patterns() assumes pattern to be normalized.
390 see: globbing.normalize_pattern
391 """
392 valid_patterns = []
393 invalid_patterns = []
394 for pat in patterns:
395 if is_pattern_valid(pat):
396 valid_patterns.append(pat)
397 else:
398 invalid_patterns.append(pat)
399 return valid_patterns, invalid_patterns
400
286401
=== modified file 'bzrlib/lazy_regex.py'
--- bzrlib/lazy_regex.py 2009-03-23 14:59:43 +0000
+++ bzrlib/lazy_regex.py 2010-06-19 13:29:28 +0000
@@ -58,7 +58,7 @@
5858
59 def _real_re_compile(self, *args, **kwargs):59 def _real_re_compile(self, *args, **kwargs):
60 """Thunk over to the original re.compile"""60 """Thunk over to the original re.compile"""
61 return _real_re_compile(*args, **kwargs)61 return real_re_compile(*args, **kwargs)
6262
63 def __getattr__(self, attr):63 def __getattr__(self, attr):
64 """Return a member from the proxied regex object.64 """Return a member from the proxied regex object.
@@ -97,11 +97,11 @@
97 Though the first call will reset back to the original (it doesn't97 Though the first call will reset back to the original (it doesn't
98 track nesting level)98 track nesting level)
99 """99 """
100 re.compile = _real_re_compile100 re.compile = real_re_compile
101101
102102
103_real_re_compile = re.compile103real_re_compile = re.compile
104if _real_re_compile is lazy_compile:104if real_re_compile is lazy_compile:
105 raise AssertionError(105 raise AssertionError(
106 "re.compile has already been overridden as lazy_compile, but this would" \106 "re.compile has already been overridden as lazy_compile, but this would" \
107 " cause infinite recursion")107 " cause infinite recursion")
108108
=== modified file 'bzrlib/tests/blackbox/test_add.py'
--- bzrlib/tests/blackbox/test_add.py 2010-02-23 07:43:11 +0000
+++ bzrlib/tests/blackbox/test_add.py 2010-06-19 13:29:28 +0000
@@ -18,8 +18,10 @@
18"""Tests of the 'bzr add' command."""18"""Tests of the 'bzr add' command."""
1919
20import os20import os
21import re
2122
22from bzrlib import (23from bzrlib import (
24 ignores,
23 osutils,25 osutils,
24 tests,26 tests,
25 )27 )
@@ -217,3 +219,29 @@
217 os.symlink(osutils.abspath('target'), 'tree/link')219 os.symlink(osutils.abspath('target'), 'tree/link')
218 out = self.run_bzr(['add', 'tree/link'])[0]220 out = self.run_bzr(['add', 'tree/link'])[0]
219 self.assertEquals(out, 'adding link\n')221 self.assertEquals(out, 'adding link\n')
222
223 def test_add_with_invalid_ignore(self):
224 """add should handle invalid ignore pattern gracefully.
225 """
226 tree = self.make_branch_and_tree('.')
227 self.build_tree(['top.txt'])
228 self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
229 out, err = self.run_bzr(['add'])
230 self.assertEquals(out, 'adding .bzrignore\nadding top.txt\n')
231 self.assertContainsRe(err,
232 'warning: Skipped invalid pattern.*RE:\*\.cpp',
233 re.DOTALL)
234
235 def test_add_with_invalid_ignore_in_global_ignore(self):
236 """add should handle invalid global ignore pattern gracefully.
237 """
238 tree = self.make_branch_and_tree('.')
239 self.build_tree(['top.txt'])
240 self.build_tree_contents([('foobar', 'foo\n')])
241 ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
242 out, err = self.run_bzr(['add'])
243 self.assertEquals(out, 'adding foobar\nadding top.txt\n')
244 self.assertContainsRe(err,
245 'warning: Skipped invalid pattern.*RE:\*\.cpp',
246 re.DOTALL)
247
220248
=== modified file 'bzrlib/tests/blackbox/test_ignore.py'
--- bzrlib/tests/blackbox/test_ignore.py 2010-05-03 09:19:15 +0000
+++ bzrlib/tests/blackbox/test_ignore.py 2010-06-19 13:29:28 +0000
@@ -162,3 +162,35 @@
162 tree = self.make_branch_and_tree('a')162 tree = self.make_branch_and_tree('a')
163 self.run_bzr(['ignore', '--directory=a', 'README'])163 self.run_bzr(['ignore', '--directory=a', 'README'])
164 self.check_file_contents('a/.bzrignore', 'README\n')164 self.check_file_contents('a/.bzrignore', 'README\n')
165
166 def test_ignore_invalid_pattern_argument(self):
167 """Ensure graceful handling for invalid ignore pattern.
168
169 Invalid pattern should show clear error message.
170 Invalid pattern should not be added to .bzrignore file.
171 """
172 tree = self.make_branch_and_tree('.')
173 out, err = self.run_bzr(['ignore', 'RE:*.cpp']) # invalid pattern
174 self.assertEqual(out, '')
175 self.assertContainsRe(err,
176 'bzr: warning: Skipped invalid.*RE:\*\.cpp', re.DOTALL)
177 self.assertFalse(os.path.isfile('.bzrignore'))
178
179 def test_invalid_pattern_in_ignore_file(self):
180 """Existing invalid patterns should be shown.
181 """
182 tree = self.make_branch_and_tree('.')
183 self.build_tree(['file'])
184 self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])
185 out, err = self.run_bzr(['ignore', 'RE:*.cpp', 'abc'])
186 self.assertEqual(out, '')
187 self.assertContainsRe(err, 'warning: Skipped invalid pattern'
188 '.*RE:\*\.cpp.*'
189 'warning: Skipped invalid pattern.*RE:\[', re.DOTALL)
190 f = open('.bzrignore', 'r')
191 try:
192 lines = f.readlines()
193 self.assertTrue('abc\n' in lines)
194 finally:
195 f.close()
196
165197
=== modified file 'bzrlib/tests/blackbox/test_ignored.py'
--- bzrlib/tests/blackbox/test_ignored.py 2010-05-02 20:10:25 +0000
+++ bzrlib/tests/blackbox/test_ignored.py 2010-06-19 13:29:28 +0000
@@ -17,6 +17,8 @@
1717
18"""Tests of the 'bzr ignored' command."""18"""Tests of the 'bzr ignored' command."""
1919
20import re
21
20from bzrlib.tests.blackbox import ExternalBase22from bzrlib.tests.blackbox import ExternalBase
2123
2224
@@ -46,3 +48,16 @@
46 ('a/.bzrignore', 'README')])48 ('a/.bzrignore', 'README')])
47 out, err = self.run_bzr(['ignored', '--directory=a'])49 out, err = self.run_bzr(['ignored', '--directory=a'])
48 self.assertStartsWith(out, 'README')50 self.assertStartsWith(out, 'README')
51
52 def test_invalid_pattern_in_ignore_file(self):
53 """Existing invalid patterns should be shown.
54 """
55 tree = self.make_branch_and_tree('.')
56 self.build_tree(['file', 'foo.c'])
57 self.build_tree_contents([('.bzrignore', '*.c\nRE:[\n')])
58 out, err = self.run_bzr(['ignored'])
59 self.assertContainsRe(out, 'foo\.c[ ]+\*\.c')
60 self.assertContainsRe(err,
61 'warning: Skipped invalid pattern.*RE:\[',
62 re.DOTALL)
63
4964
=== modified file 'bzrlib/tests/blackbox/test_ls.py'
--- bzrlib/tests/blackbox/test_ls.py 2010-05-02 20:10:25 +0000
+++ bzrlib/tests/blackbox/test_ls.py 2010-06-19 13:29:28 +0000
@@ -16,6 +16,7 @@
1616
17"""External tests of 'bzr ls'"""17"""External tests of 'bzr ls'"""
1818
19import re
19import os20import os
2021
21from bzrlib import ignores, osutils22from bzrlib import ignores, osutils
@@ -244,3 +245,25 @@
244 self.wt.commit('commit')245 self.wt.commit('commit')
245 self.ls_equals('sub/\nsub/file\n', '--directory=dir')246 self.ls_equals('sub/\nsub/file\n', '--directory=dir')
246 self.ls_equals('sub/file\n', '-d dir sub')247 self.ls_equals('sub/file\n', '-d dir sub')
248
249 def test_ls_invalid_ignore_pattern(self):
250 """Tests status with invalid ignore pattern"""
251 f = open('.bzrignore', 'w')
252 try:
253 f.write(u"RE:*.cpp\n") # invalid pattern
254 finally:
255 f.close()
256 self.build_tree(['bye.c'])
257 out, err = self.run_bzr(['ls', '-i'])
258 self.assertEquals(out, '')
259 self.assertContainsRe(err,
260 'warning: Skipped invalid pattern.*RE:\*\.cpp',
261 flags=re.DOTALL)
262 out, err = self.run_bzr(['ls', '-u'])
263 self.assertEquals(out, '.bzrignore\na\nbye.c\n')
264 self.assertContainsRe(err,
265 'warning: Skipped invalid pattern.*RE:\*\.cpp',
266 flags=re.DOTALL)
267
268
269
247270
=== modified file 'bzrlib/tests/blackbox/test_status.py'
--- bzrlib/tests/blackbox/test_status.py 2010-04-07 21:34:13 +0000
+++ bzrlib/tests/blackbox/test_status.py 2010-06-19 13:29:28 +0000
@@ -25,6 +25,7 @@
25from cStringIO import StringIO25from cStringIO import StringIO
26import codecs26import codecs
27from os import mkdir, chdir, rmdir, unlink27from os import mkdir, chdir, rmdir, unlink
28import re
28import sys29import sys
29from tempfile import TemporaryFile30from tempfile import TemporaryFile
3031
@@ -32,6 +33,7 @@
32 bzrdir,33 bzrdir,
33 conflicts,34 conflicts,
34 errors,35 errors,
36 ignores,
35 osutils,37 osutils,
36 )38 )
37import bzrlib.branch39import bzrlib.branch
@@ -701,6 +703,73 @@
701 out, err = self.run_bzr('status tree/a')703 out, err = self.run_bzr('status tree/a')
702 self.assertNotContainsRe(out, 'pending merge')704 self.assertNotContainsRe(out, 'pending merge')
703705
706 def test_status_invalid_ignore_pattern(self):
707 """Tests status with invalid ignore pattern"""
708 wt = self.make_branch_and_tree('.')
709 f = open('.bzrignore', 'w')
710 try:
711 f.write(u"RE:*.cpp\n") # invalid pattern
712 finally:
713 f.close()
714 wt.commit('commit .bzrignore')
715 self.build_tree(['bye.c'])
716 out, err = self.run_bzr(['status'])
717 self.assertEquals(out, 'unknown:\n .bzrignore\n bye.c\n')
718 self.assertContainsRe(err,
719 'warning: Skipped invalid pattern.*RE:\*\.cpp',
720 flags=re.DOTALL)
721
722 def test_status_invalid_ignore_pattern_in_global_ignore(self):
723 """Tests status with invalid ignore pattern in .bazaar/ignore"""
724 wt = self.make_branch_and_tree('.')
725 self.build_tree(['file'])
726 wt.add('file')
727 wt.commit('one')
728 # add unknown file
729 f = open('foobar', 'w')
730 try:
731 f.write(u"foo\n") # invalid pattern
732 finally:
733 f.close()
734 ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
735 out, err = self.run_bzr(['status'])
736 self.assertEquals(out, 'unknown:\n foobar\n')
737 self.assertContainsRe(err,
738 'warning: Skipped invalid.*RE:\*\.cpp',
739 flags=re.DOTALL)
740
741 def test_short_status_invalid_ignore_pattern(self):
742 """Tests short status with invalid ignore pattern"""
743 wt = self.make_branch_and_tree('.')
744 f = open('.bzrignore', 'w')
745 try:
746 f.write(u"RE:*.cpp\n") # invalid pattern
747 finally:
748 f.close()
749 wt.commit('commit .bzrignore')
750 out, err = self.run_bzr(['status', '-S'])
751 self.assertEquals(out, '? .bzrignore\n')
752 self.assertContainsRe(err,
753 'warning: Skipped invalid.*RE:\*\.cpp',
754 flags=re.DOTALL)
755
756 def test_specific_status_invalid_ignore_pattern(self):
757 """Tests specific status with invalid ignore pattern"""
758 wt = self.make_branch_and_tree('.')
759 self.build_tree(['file'])
760 wt.add('file')
761 wt.commit('added file')
762 f = open('.bzrignore', 'w')
763 try:
764 f.write(u"RE:*.cpp\n") # invalid pattern
765 finally:
766 f.close()
767 wt.commit('commit .bzrignore')
768 out, err = self.run_bzr(['status', 'file'])
769 self.assertEquals(out, '')
770 self.assertContainsRe(err,
771 'warning: Skipped invalid.*RE:\*\.cpp',
772 flags=re.DOTALL)
704773
705class TestStatusEncodings(TestCaseWithTransport):774class TestStatusEncodings(TestCaseWithTransport):
706775
707776
=== modified file 'bzrlib/tests/test_globbing.py'
--- bzrlib/tests/test_globbing.py 2010-02-17 17:11:16 +0000
+++ bzrlib/tests/test_globbing.py 2010-06-19 13:29:28 +0000
@@ -16,10 +16,13 @@
16# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA16# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1717
18from bzrlib.globbing import (18from bzrlib.globbing import (
19 ExceptionGlobster,
20 filter_invalid_patterns,
19 Globster,21 Globster,
20 ExceptionGlobster,22 is_pattern_valid,
23 normalize_pattern,
21 _OrderedGlobster,24 _OrderedGlobster,
22 normalize_pattern25 _Pattern,
23 )26 )
24from bzrlib.tests import (27from bzrlib.tests import (
25 TestCase,28 TestCase,
@@ -37,11 +40,11 @@
37 for name in positive:40 for name in positive:
38 self.failUnless(globster.match(name), repr(41 self.failUnless(globster.match(name), repr(
39 u'name "%s" does not match glob "%s" (re=%s)' %42 u'name "%s" does not match glob "%s" (re=%s)' %
40 (name, glob, globster._regex_patterns[0][0].pattern)))43 (name, glob, globster.regex_patterns[0][0].pattern)))
41 for name in negative:44 for name in negative:
42 self.failIf(globster.match(name), repr(45 self.failIf(globster.match(name), repr(
43 u'name "%s" does match glob "%s" (re=%s)' %46 u'name "%s" does match glob "%s" (re=%s)' %
44 (name, glob, globster._regex_patterns[0][0].pattern)))47 (name, glob, globster.regex_patterns[0][0].pattern)))
4548
46 def assertMatchBasenameAndFullpath(self, matchset):49 def assertMatchBasenameAndFullpath(self, matchset):
47 # test basename matcher50 # test basename matcher
@@ -371,3 +374,37 @@
371 """tests that multiple mixed slashes are collapsed to single forward374 """tests that multiple mixed slashes are collapsed to single forward
372 slashes and trailing mixed slashes are removed"""375 slashes and trailing mixed slashes are removed"""
373 self.assertEqual(u'/foo/bar', normalize_pattern(u'\\/\\foo//\\///bar/\\\\/'))376 self.assertEqual(u'/foo/bar', normalize_pattern(u'\\/\\foo//\\///bar/\\\\/'))
377
378class TestInvalidPattern(TestCase):
379
380 def test_is_pattern_valid(self):
381 """Ensure is_pattern_valid() detects invalid patterns.
382 """
383 self.assertTrue(is_pattern_valid(normalize_pattern(u'foo')))
384 self.assertFalse(is_pattern_valid(normalize_pattern(u'RE:*.cpp')))
385
386 def test_filter_invalid_patterns(self):
387 """Ensure filter_invalid_patterns() returns only valid patterns.
388 """
389 patterns = [u'abc', u'RE:*.cpp', u'[', u'def']
390 patterns = [normalize_pattern(p) for p in patterns]
391 ref_valid_patterns = set([u'abc', u'def'])
392 ref_invalid_patterns = set([u'RE:*.cpp', u'['])
393 valid_patterns, invalid_patterns = filter_invalid_patterns(patterns)
394 self.assertEqual(ref_valid_patterns, set(valid_patterns))
395 self.assertEqual(ref_invalid_patterns, set(invalid_patterns))
396
397class TestPatternClass(TestCase):
398
399 def test_pattern_category(self):
400 """Ensure pattern categories are identified correctly.
401 """
402 fullpath0 = normalize_pattern(u'/home/jrandom')
403 fullpath1 = normalize_pattern(u'RE:foobar')
404 extension = normalize_pattern(u'*.py')
405 basename = normalize_pattern(u'random_file')
406 self.assertEqual(_Pattern.identify(fullpath0), _Pattern.TYPE_FULLPATH)
407 self.assertEqual(_Pattern.identify(fullpath1), _Pattern.TYPE_FULLPATH)
408 self.assertEqual(_Pattern.identify(extension), _Pattern.TYPE_EXTENSION)
409 self.assertEqual(_Pattern.identify(basename), _Pattern.TYPE_BASENAME)
410
374411
=== modified file 'bzrlib/tests/test_ignores.py'
--- bzrlib/tests/test_ignores.py 2010-05-03 04:51:58 +0000
+++ bzrlib/tests/test_ignores.py 2010-06-19 13:29:28 +0000
@@ -217,3 +217,13 @@
217 ignores.tree_ignores_add_patterns(tree, ["myentry2", "foo"])217 ignores.tree_ignores_add_patterns(tree, ["myentry2", "foo"])
218 self.assertEquals(open('.bzrignore', 'rb').read(), 'myentry1\r\nmyentry2\r\nfoo\r\n')218 self.assertEquals(open('.bzrignore', 'rb').read(), 'myentry1\r\nmyentry2\r\nfoo\r\n')
219 self.assertPatternsEquals(["myentry1", "myentry2", "foo"])219 self.assertPatternsEquals(["myentry1", "myentry2", "foo"])
220
221 def test_bad_pattern(self):
222 """tree.is_ignored() should handle bad patterns.
223 """
224 tree = self.make_branch_and_tree(".")
225 self.build_tree(['foo.c'])
226 self.build_tree_contents([('.bzrignore', u"RE:*.cpp\nfoo.py\n")])
227 tree.add([".bzrignore"])
228 self.assertFalse(tree.is_ignored('foo.c'))
229