Merge lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern into lp:bzr

Proposed by Parth Malwankar on 2010-05-31
Status: Work in progress
Proposed branch: lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern
Merge into: lp:bzr
Diff against target: 698 lines (+360/-66)
17 files modified
NEWS (+8/-0)
bzrlib/builtins.py (+11/-2)
bzrlib/errors.py (+7/-0)
bzrlib/globbing.py (+134/-4)
bzrlib/ignores.py (+2/-3)
bzrlib/lazy_regex.py (+13/-6)
bzrlib/log.py (+1/-3)
bzrlib/osutils.py (+0/-23)
bzrlib/tests/__init__.py (+1/-2)
bzrlib/tests/blackbox/test_add.py (+30/-0)
bzrlib/tests/blackbox/test_ignore.py (+33/-0)
bzrlib/tests/blackbox/test_ignored.py (+14/-0)
bzrlib/tests/blackbox/test_log.py (+2/-5)
bzrlib/tests/blackbox/test_status.py (+53/-0)
bzrlib/tests/test_globbing.py (+50/-0)
bzrlib/tests/test_ignores.py (+1/-0)
bzrlib/tests/test_osutils.py (+0/-18)
To merge this branch: bzr merge lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern
Reviewer Review Type Date Requested Status
John A Meinel Needs Fixing on 2010-06-28
Robert Collins (community) 2010-05-31 Needs Fixing on 2010-06-21
Review via email: mp+26400@code.launchpad.net

Commit message

Improved handling of invalid ignore patterns by add, ignore, ignored and status.

Description of the change

=== Fixes Bug #300062 ===

This started out as a simple bug fix to ensure that 'bzr ignore' handles invalid patterns correctly but as invalid patterns impacted other commands this patch ended up fixing those as well. bzr can encounter invalid patterns in either .bzrignore file or .bazaar/ignore file. These could earlier get in either by using 'bzr ignore' or by editing the files directly. The former is now fixed. This patch fixes the following:
1. 'bzr ignore' when supplied with possibly multiple patterns, filters out invalid patters and only adds valid patterns to .bzrignore

2. 'bzr status' checks ignore patterns before showing status. This was crashing if invalid patterns were present. 'bzr status' now issues a warning if invalid pattern(s) are seen and does not crash. In order to avoid any performance penalty, `status` works as before, but if it sees sees re.error then it parser both ignore files and prints out the bad pattern(s).

3. 'bzr add' checks ignore patterns before adding files. This was crashing if invalid patterns were present. 'bzr add' now exists with an error message indicating invalid patterns. Initially I was trying to decide between a warning or error for this but decided that its better for `add` to not be too smart as it might end up adding files that were meant to be ignored.

Implementations Notes
=====================
As `re` is lazy by default. I have added a function `bzrlib.lazy_regex.eager_re_compile` that is used by `bzrlib.globbing.is_pattern_valid` which in turn is used by to create `bzrlib.globbing.filter_invalid_patterns`. This is used by `ignore` command to ensure that no junk enters .bzrignore.

`bzrlib.ignores` has two new functions, `get_pattern_errors` gives information about bad patterns in .bzrignore and .bazaar/ignore. `show_pattern_errors` is used to display warning about bad patterns to the user. `show_pattern_errors` is used by `status`. `get_pattern_errors` is used by `add` which raises error `InvalidIgnorePattern`.

I noticed that imports in bzrlib.status and bzrlib.ignores were not lazy. So I have made these lazy and I had to add a few imports to these files.

Tests have been added as needed.

To post a comment you must log in.
Robert Collins (lifeless) wrote :

So, this is comprehensive, and I like that. I have a general point,
and a couple of specifics. I haven't done a line by line reading [yet,
 I will after the general point is discussed].

Firstly, the general point. I think its fine, if the ignore file is
damaged to just stop. add doesn't need to be able to succeed when the
environment is broken. Stop (let the exception propogate) and have the
user fix the issue. Better that than adding something they wanted
ignored, I think. Similarly for status.

Now, I might be misreading, and thats what you're doing : but if thats
what you're doing, its leaking up into code that it shouldn't. So I'd
like to see this patch with no changes to code that uses ignores - no
changes should be needed to do what I've described. This is because I
don't think the approach of scattering try/excepts around in higher
level code makes sense: push it down into the ignores module.

Secondly, the specific issue - to expose the real re compile function,
just assign it to a variable that can be accessed: wrappers of the
form:
def a_foo(param):
   result = real_foo(param)
   return result

can be rewritten as
def a_foo(param):
    return real_foo(param)

which is precisely equivalent to
a_foo = real_foo

So in this case, just call the _real_re_compile function, as is. No
changes needed.

sidenote: I rather suspect theres a bunch of edits here we may end up
not wanting, and perhaps we'll want to flatten the history to avoid
that overhead.

-Rob

Robert Collins (lifeless) wrote :

Oh, and one more thing: the tests you're writing are very far away
from the code you're testing; please consider using tests for the
specific code, not blackbox tests. blackbox tests should really only
be needed when you're changing code in builtins.py or the few other
command-class containing files.

-Rob

Parth Malwankar (parthm) wrote :

> So, this is comprehensive, and I like that. I have a general point,
> and a couple of specifics. I haven't done a line by line reading [yet,
> I will after the general point is discussed].
>
> Firstly, the general point. I think its fine, if the ignore file is
> damaged to just stop. add doesn't need to be able to succeed when the
> environment is broken. Stop (let the exception propogate) and have the
> user fix the issue. Better that than adding something they wanted
> ignored, I think. Similarly for status.
>

I have update `status` to fail as well. `add` was already raising an
exception for invalid pattern. Making it consistent does make the
code cleaner.

> Secondly, the specific issue - to expose the real re compile function,
> just assign it to a variable that can be accessed: wrappers of the
> form:
> def a_foo(param):
> result = real_foo(param)
> return result
>
> can be rewritten as
> def a_foo(param):
> return real_foo(param)
>
> which is precisely equivalent to
> a_foo = real_foo
>
>
> So in this case, just call the _real_re_compile function, as is. No
> changes needed.
>

I was reluctant to use _real_re_compile as it was private. Its now
renamed to real_re_compile and used directly.

> sidenote: I rather suspect theres a bunch of edits here we may end up
> not wanting, and perhaps we'll want to flatten the history to avoid
> that overhead.
>

I am not sure I understand. Do you mean using bzr-rewrite?

> -Rob

Thanks for the review.

Parth Malwankar (parthm) wrote :

> Oh, and one more thing: the tests you're writing are very far away
> from the code you're testing; please consider using tests for the
> specific code, not blackbox tests. blackbox tests should really only
> be needed when you're changing code in builtins.py or the few other
> command-class containing files.
>

Done. I have added unit tests for the new functions along with the
blackbox tests.

Parth Malwankar (parthm) wrote :

Some other things are fixed in the latest set of changes:
 - `ignored` was crashing on invalid pattern.
 - lock/unlock for ignore wasn't in try/finally or cleanup.
 - As InvalidPattern exception is thrown from the lowest level, any uncaught regex error in bzr will give a InvalidPattern bzr error rather than stack trace. `add`, `ignore`, `ignored`, `status` catch this to give bad ignore pattern info.

Robert Collins (lifeless) wrote :

Why do they need to catch it? Why can't the exception __str__ just
show it, avoiding all changes to add/ignore/ignored and status
commands?

-Rob

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> Why do they need to catch it? Why can't the exception __str__ just
> show it, avoiding all changes to add/ignore/ignored and status
> commands?
>
> -Rob

Just mentioning that I don't think falling over and failing on an error
is the best route. When people want to run "bzr status" they want it to
complete. Completely blocking the user because they have a typo in their
ignore file is not going to win "user-friendly" points.

Obviously Robert and I disagree here, though, so we may need someone
else to chime in as a tiebreaker.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwFHfQACgkQJdeBCYSNAAM68QCgkT7TycEQ5NeRcs1rXgfRklrl
Bq8AoKfmmTzMfK+IbGrwaQ41kEuJWWvf
=n8/E
-----END PGP SIGNATURE-----

Parth Malwankar (parthm) wrote :

> Why do they need to catch it? Why can't the exception __str__ just
> show it, avoiding all changes to add/ignore/ignored and status
> commands?
>
> -Rob

regex are used in other places like `log -m`. So, a failure
doesn't necessarily mean its an ignore pattern issue. Hence,
the caller needs to decide whats the best way to deal with
it. Also, in some cases we want to show a warning (ignore)
and in others, an error (add).

Parth Malwankar (parthm) wrote :

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > Why do they need to catch it? Why can't the exception __str__ just
> > show it, avoiding all changes to add/ignore/ignored and status
> > commands?
> >
> > -Rob
>
> Just mentioning that I don't think falling over and failing on an error
> is the best route. When people want to run "bzr status" they want it to
> complete. Completely blocking the user because they have a typo in their
> ignore file is not going to win "user-friendly" points.
>

Having `status` work seems reasonable to me because in case of failed
ignores the files displayed end up as 'unknown' and the rest of the
status should be the same. The code does get a little ugly as we need
to catch the exception in three places in status.py.

Robert Collins (lifeless) wrote :

John, lets explore this a little more.

Possibly the most common case of a broken ignore file will be new
users, learning how to use the system, and running 'bzr st' on a newly
inited tree, showing hundreds or thousands of files.

I don't think a little warning message shown partway through the
output will win user friendly points - and they may well be scratching
their head going 'why won't it ignore what I asked it to': people
learning a new system can often get rattled.

I don't think stopping cold is necessarily better than outputting
details about whats wrong, but I am not at all convinced that
completing-with-potentially-very-many-unknowns is better.

I'm concerned about having more try:excepts in the very innermost loop
of what we do, we've had to take them out of code paths before, and
status is one of the most heavily exercised paths. What I saw of the
earlier diff had try: excepts: inside inner loops, rather than once,
at compilation of the regex time. Data will help (test on a 20K file
tree, with 40K ignored files (.o's and .d's)

Also, as we only compile *once* there is a clear bug - if the
categorisation fails on the first file, and the latter ones just fail
open to unknowns, then first file *also* should fail open to an
unknown, and there is no reason to be raising an exception - just log
directly from the compilation failure point.

Martin Pool (mbp) wrote :

On 2 June 2010 00:52, John A Meinel <email address hidden> wrote:
> Just mentioning that I don't think falling over and failing on an error
> is the best route. When people want to run "bzr status" they want it to
> complete. Completely blocking the user because they have a typo in their
> ignore file is not going to win "user-friendly" points.

I would tend to expect it to give a user warning not an error. If
it's unreasonably messy or slow to implement I wouldn't absolutely
insist on it. A user friendly error is an improvement on an internal
error.

--
Martin <http://launchpad.net/~mbp/>

Robert Collins (lifeless) wrote :

To be clear, I was suggesting a user user - nicely formatted, and stop
the operation, not a traceback or InternalError.

Parth Malwankar (parthm) wrote :

This is how the output currently looks (with status showing an error):

[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr ignore "RE:[" aaa
bzr: warning: Skipped invalid pattern(s):
  RE:[
========================================================================
[aaa]% echo "RE:[" >> .bzrignore
[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/aaa/.bzrignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
========================================================================
[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr add
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/aaa/.bzrignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
========================================================================
[aaa]% echo "RE:*.cpp" >> ~/.bazaar/ignore
[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr add
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/aaa/.bzrignore":
  RE:[
bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
  RE:*.cpp
bzr: ERROR: Invalid pattern(s) found.
[aaa]%

Parth Malwankar (parthm) wrote :
Download full text (3.6 KiB)

I had an interesting discussion with jam regarding warning or error on irc. Unfortunately no conclusion yet :P

<jam> hi parthm
<parthm> jam: hi, i was thinking some more about the `bzr st` bad ignore pattern
i am somewhat divided between warning / error but am leaning towards error. if we have a bad pattern other commands like `add` and `ignored` would fail anyway.
<parthm> i am not sure if there are others. so it may be better to fail quickly and have the user fix it sooner than later.
<parthm> there could also be a case that a user ends up pushing a bad pattern to their server/lp. as .bzrignore is already added. its just a `bzr ci` and `bzr push`.
so failing on `st` may be a good idea.
<jam> parthm: why would add and ignore have to fail?
the point is that, especially right now, people may have a bad config
having it fall over when it used to work is bad
having it fall over *in general* is bad
I don't feel super strongly
but we've done the 'error early' in the past
and generally it has provided worse user experience
as a minor misconfiguration in X
suddenly blocks Y and Z from working at all
<parthm> jam: `ignore` works, it issues a warning and discards the bad pattern.
<jam> even if the pattern was already in the file?
<parthm> `ignored` fails as we compile all ignored patterns (99 at a time) into one pattern. so we cant selectively ignore.
<jam> you certainly *could*
<parthm> jam: yes. if a file has a bad pattern, `ignore` would still work. so it issues a warning about the pattern on the command line. if the tree is processed, then a followup error is shown about the existing bad pattern.
<jam> parthm: the problem is that if we filter out the bad pattern from the disk content we make it harder to get the right pattern
since the old one is gone
it may be a fair trade
but something to be thought about
<parthm> jam: we don't filter out the bad pattern from the file. filtering is only for the command like. we display an error message for the pattern in file.
jam: http://pastebin.com/fMWtEUjc
<jam> so for compiling, we could certainly attempt the 100-way compile and if it fails rollback
<parthm> jam: i am not sure i understand. are you talking about `ignored`?
<jam> yes
but also statu
status
add, etc
<parthm> jam: what do you mean by try the compile and rollback?
<jam> parthm: try to compile all 100, if it fails, compile them individually to find the one that failed
<parthm> jam: thats whats done in the current patch. the current patch allow the operation to happen, if there is an exception, it processes the file to find the bad pattern and displays it along with the error message.
<jam> so why does ignored *have* to fail?
<parthm> jam: so are you suggesting we retry after filtering out the bad pattern at runtime?
<jam> That is what I've been suggesting, yes
<parthm> jam: IMO once the user has a bad pattern in the file he would sooner or later end up with a problem so it may be better to just error out pointing to the pattern.
i don't think existing installation would have a bad pattern as it would make bzr crash.
<jam> parthm: the difference is whether someone can still finish what they were trying to do
*before* they have to fix everything
o...

Read more...

Vincent Ladeuil (vila) wrote :

+1 for warning, blocking a user is bad, especially if .bzrignore is
versioned and the said user just got one with a bad pattern (some people
will do that)

Parth Malwankar (parthm) wrote :

It seem to me that the majority prefers bzr to be more forgiving about bad patterns. I can go ahead with a warning based approach if there are no objections to this over the next day or two.

Here is what I understand.

1. `add` - This should fail (error) as we don't wan't to add file that the user may want to ignore
2. `ignore` - Show warning and skip any bad patterns found on the command line.
3. `ignored` - Try to show ignored files. If this errors out with bad regex, filter out the faulting regex(es) at runtime and retry. The disk files are not changes. Issue a warning about faulting regexes.
4. `status` - If we see a regex failure, filter out the faulting pattern at runtime and retry the ignored files. The retry based approach should reduce the number of 'unknown' files listed.

This would also impact `ls -i/-u`. I will look into that some more. I suppose we would want a similar retry based behavior for this also.

Any further thoughts/ideas about the user interaction should be? Any more commands that might be impacted?

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Parth Malwankar wrote:
> It seem to me that the majority prefers bzr to be more forgiving about bad patterns. I can go ahead with a warning based approach if there are no objections to this over the next day or two.
>
> Here is what I understand.
>
> 1. `add` - This should fail (error) as we don't wan't to add file that the user may want to ignore

I would actually disagree and just warn. The user can always 'bzr
revert' or 'bzr rm'. Especially if they were supplying explicit paths
("bzr add foo" shouldn't care about ignore patterns, because it always
adds foo even if it was ignored.)

> 2. `ignore` - Show warning and skip any bad patterns found on the command line.
> 3. `ignored` - Try to show ignored files. If this errors out with bad regex, filter out the faulting regex(es) at runtime and retry. The disk files are not changes. Issue a warning about faulting regexes.
> 4. `status` - If we see a regex failure, filter out the faulting pattern at runtime and retry the ignored files. The retry based approach should reduce the number of 'unknown' files listed.
>
> This would also impact `ls -i/-u`. I will look into that some more. I suppose we would want a similar retry based behavior for this also.
>
> Any further thoughts/ideas about the user interaction should be? Any more commands that might be impacted?
>

I would look to do it the 'cleanest' way internally. My feeling is that
turning everything into a warning is likely to be fairly straightforward
and the most worthwhile.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwH7UQACgkQJdeBCYSNAAN3hACgvtuI/Jb1yJ8pJQPsGxwfG0f0
eF8AoI+7+E1fKzOZZ4kq3phkxCJmTEXL
=fiKw
-----END PGP SIGNATURE-----

Robert Collins (lifeless) wrote :

I'd like to see a really clean internal behaviour + a good user experience.

I don't agree with John that the user may have a bad ignore file
already: compilation will be breaking, and it will be throwing the
native RE error at the moment, so existing, broken files are fine to
error on. The point is to error:
 - cleanly
 - helpfully
 - in a way they can correct

(and correcting a .bzrignore file is easy)

I have a strong feeling that, particularly for add and status in newly
created trees, erroring is actually more user friendly that warning:
warnings can get lost and ignored, particularly when you're dealing
with hundreds or thousands of lines of output. However, I think any
improvement is better than what we have today, so we should just do
whatever Parth wants to do here, and iterate when we find out its
wrong ;).

In terms of internal code, I'd like it if whatever approach is taken:
 - does not need handling code per-code-path-using-ignores
 - is entirely encapsulated in ignores.py
 - doesn't add a new exception handler to every unversioned file
considered by add

This is totally doable, I think, for either warnings or erroring, and
not something I expect any of us to disagree on:)

-Rob

Robert Collins (lifeless) wrote :

Ok, I've looked through this again, and I'd like to suggest a deeper implementation strategy.

From the top level, whats going on here is this:
 - working trees have some user data (ignore rules)
 - we compile the user rules into an intermediary language (a list of regexes which we | together)
 - we lazy-compile that intermediary form into a list of re's we can match things against.

So there are the following translation steps that can fail:
 - ignores line -> string regex
 - string regex -> compiled regex

Constraint wise we don't want to compile re's more than once in normal circumstances.

We don't want to expose internal workings unless needed.

We want all commands that use ignores to benefit.

If you look at ExceptionGlobster and Globster, these objects already know about all the patterns, and they are responsible for the glob->re compilation too. As you have made their match raise InvalidPattern error, its actually entirely reasonable for the outer most one (ExceptionGlobster) to when it has a pattern error, check all of its regexs on the spot, and raise an InvalidPatternError which has all the patterns that are wrong.

That won't affect log because log doesn't using globbing (and if it did that would be ok too, because it would show the bad pattern); it will also help 'rules' if you do this to _OrderedGlobster as well.

So in short:
 - unwind all the changes to builtins and ignores, they are the wrong level and API to be fixing this; change just globbing.py and add the new error to errors.py; add some unit tests for this case to *Globster.match, and we're done.

review: Needs Fixing
5288. By Parth Malwankar on 2010-06-21

merged in changes from trunk

5289. By Parth Malwankar on 2010-06-22

better invalid pattern error handling.
Invalid patterns are now handled in a single location,
bzrlib.globbing.Globster.

Parth Malwankar (parthm) wrote :

> Ok, I've looked through this again, and I'd like to suggest a deeper
> implementation strategy.

Thanks for the review and comment. I have cleanup the error handling now.
Basically, the following is done:
1. General regex compile failure now raises InvalidPattern error.
2. Invalid pattern errors are now raised only by bzrlib.globbing.Globster.match
3. `bzr ignore PATTERN..` skips invalid patterns passed. Existing errors are handled by bzrlib.globbing.Globster.match.

John A Meinel (jameinel) wrote :
Download full text (4.7 KiB)

1) It might be clearer to resubmit this, since it has changed fairly significantly since the beginning. (So I don't think Robert's Needs Fixing still directly applies.)

2)
61 +class InvalidPattern(BzrError):
62 +
63 + _fmt = ('Invalid pattern(s) found.')
64 +
65 + def __init__(self, message):
66 + self.message = message
67 +

^- You set a message, but then never display it (there is nothing in _fmt which shows the attributes.)

I suppose in builtins you do:
29 + if invalid_patterns:
30 + pats = '\n '.join(invalid_patterns)
31 + ui.ui_factory.show_warning(
32 + 'Skipped invalid pattern(s):\n %s' % pats)
^- which explicitly displays the bad patterns, but then you do so in a warning and not an error. Which seems to disagree with your "now raises exceptions" comment.

3) It seems very strange to me that 'show_pattern_errors' is a module-level function that re-reads the ignore files and reports on them.

Rather than, for example, having the state of the Globster used to determine what patterns are invalid.

It means that anyone using Globster for anything that isn't *bzr* (imagine qbzr using it for something) is going to get strange results. (As it will tell them about problems with something that isn't related at all.)

4) so for the construct here:
105 + except errors.InvalidPattern, e:
106 + # lazy_regex throws InvalidPattern in case of bad pattern.
107 + # For Globbing, the bad pattern may have come from an ignore
108 + # file, so we should any bad patterns that the ignore file(s)
109 + # may have.
110 + show_pattern_errors()
111 + raise errors.InvalidPattern(str(e))

It seems better to me to do the formatting and put that on the errors.InvalidPattern that is getting raised, rather than calling show_pattern_errors() and having that display.

I also find it strange to catch InvalidPattern and then raise another InvalidPattern(str(first_e))

Especially given your formatting for InvalidPattern this would become:
    InvalidPattern('Invalid pattern(s) found.')

Which would lose any sort of context that the original error tried to pass on. (So if a low level exception raised InvalidPattern('pattern XXX is broken') then you catch it and re-raise a InvalidPattern('Invalid pattern(s) found.') which is pretty unhelpful.)

5) In this test:
351 + def test_add_invalid_ignore_in_global_ignore(self):
352 + """add should handle invalid global ignore pattern gracefully.
353 + """
354 + tree = self.make_branch_and_tree('.')
355 + self.build_tree(['top.txt'])
356 + self.build_tree_contents([('foobar', 'foo\n')])
357 + ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
358 + out, err = self.run_bzr(['add'], 3)
359 + self.assertEquals(out, '')
360 + self.assertContainsRe(err,
361 + ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
362 + 'Invalid pattern'),
363 + re.DOTALL)
364 +

^- You make it so that 'ignores.add_unique_user_ignores()' allows an invalid pattern. It seems more logical to fault that at insert time, rather than at 'bzr add' time. (We probably still need to fail there, but we should first forbid people from adding broken entries in the first place.)

388 + def test_invalid_pattern_in_ignore_file(self):
389 + """Existing inval...

Read more...

review: Needs Fixing
Parth Malwankar (parthm) wrote :
Download full text (5.7 KiB)

> 1) It might be clearer to resubmit this, since it has changed fairly
> significantly since the beginning. (So I don't think Robert's Needs Fixing
> still directly applies.)
>
> 2)
> 61 +class InvalidPattern(BzrError):
> 62 +
> 63 + _fmt = ('Invalid pattern(s) found.')
> 64 +
> 65 + def __init__(self, message):
> 66 + self.message = message
> 67 +
>
>
> ^- You set a message, but then never display it (there is nothing in _fmt
> which shows the attributes.)
>

Good catch. Will fix it and resubmit.

> I suppose in builtins you do:
> 29 + if invalid_patterns:
> 30 + pats = '\n '.join(invalid_patterns)
> 31 + ui.ui_factory.show_warning(
> 32 + 'Skipped invalid pattern(s):\n %s' % pats)
> ^- which explicitly displays the bad patterns, but then you do so in a warning
> and not an error. Which seems to disagree with your "now raises exceptions"
> comment.
>

This is for the `bzr ignore` command where if the user gives 'bzr good1 good2 bad1 good3 bad2', good1-3 get added to .bzrignore and a warning is issued for bad1-2.

> 3) It seems very strange to me that 'show_pattern_errors' is a module-level
> function that re-reads the ignore files and reports on them.
>
> Rather than, for example, having the state of the Globster used to determine
> what patterns are invalid.
>
> It means that anyone using Globster for anything that isn't *bzr* (imagine
> qbzr using it for something) is going to get strange results. (As it will tell
> them about problems with something that isn't related at all.)
>

One reason for doing this was that the invalid pattern could come either from .bazaar/ignore of .bzrignore and I wanted to display a message indicating the bad pattern along with the specific file information. E.g.

[a1234]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/a1234/.bzrignore":
  RE:*.x
bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
[a1234]%

We could use the Globster state to display the message but the user would need to check both files. I would expect that the probability of an error in .bzrignore is higher but it just seems nicer to indicate the file.
Thoughts?

> 4) so for the construct here:
> 105 + except errors.InvalidPattern, e:
> 106 + # lazy_regex throws InvalidPattern in case of bad pattern.
> 107 + # For Globbing, the bad pattern may have come from an ignore
> 108 + # file, so we should any bad patterns that the ignore file(s)
> 109 + # may have.
> 110 + show_pattern_errors()
> 111 + raise errors.InvalidPattern(str(e))
>
> It seems better to me to do the formatting and put that on the
> errors.InvalidPattern that is getting raised, rather than calling
> show_pattern_errors() and having that display.
>
> I also find it strange to catch InvalidPattern and then raise another
> InvalidPattern(str(first_e))
>
> Especially given your formatting for InvalidPattern this would become:
> InvalidPattern('Invalid pattern(s) found.')
>
> Which would lose any sort of context that the original error...

Read more...

5290. By Parth Malwankar on 2010-06-29

merged changes from trunk

5291. By Parth Malwankar on 2010-06-29

removed re_compile_checked as re.compile shows InvalidPattern error.
Fixed test cases.

John A Meinel (jameinel) wrote :
Download full text (4.4 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...

>> It means that anyone using Globster for anything that isn't *bzr* (imagine
>> qbzr using it for something) is going to get strange results. (As it will tell
>> them about problems with something that isn't related at all.)
>>
>
> One reason for doing this was that the invalid pattern could come either from .bazaar/ignore of .bzrignore and I wanted to display a message indicating the bad pattern along with the specific file information. E.g.

Could we just include this information as part of the Globster from the
beginning? I understand that knowing where the pattern is bad is
helpful. I'm just concerned that it is using state after-the-fact to
debug a current failure. And anything that isn't *bzr* is likely to be
wrong here.

>
> [a1234]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
> bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/a1234/.bzrignore":
> RE:*.x
> bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
> RE:[
> bzr: ERROR: Invalid pattern(s) found.
> [a1234]%
>
> We could use the Globster state to display the message but the user would need to check both files. I would expect that the probability of an error in .bzrignore is higher but it just seems nicer to indicate the file.
> Thoughts?

Pass in the name of the file associated with each pattern as part of
creating the Globster. Alternatively just give the short 'invalid
pattern found'. Better (IMO) than re-parsing the files.

...

>>
>> ^- You make it so that 'ignores.add_unique_user_ignores()' allows an invalid
>> pattern. It seems more logical to fault that at insert time, rather than at
>> 'bzr add' time. (We probably still need to fail there, but we should first
>> forbid people from adding broken entries in the first place.)
>>
>
> This is done in cmd_ignore.run. This was the "Skipped invalid pattern(s)" warning. So the bad pattern doesn't reach this point. This test case injects the bad pattern to handle the case of when the user might add the bad pattern using and external editor.
>
>

It seems better to me to make our internals robust and hard to subvert
accidentally, rather than assuming that will be done at the cmd_* layer.
Certainly qbzr or explorer is likely to end up with their own UI for
adding ignores, and it would be nice if they could also have invalid
ignores captured early rather than late. (Ideally without having to
re-implement it.)

>
>
>> ^- I would probably say that we should fail all insertions if one is invalid,
>> but we would want it to be atomic. eg if the do 'bzr ignore valid valid
>> invalid valid' either all valids get inserted [as you have today] or nothing
>> gets inserted. You don't want to add the first two, and then fail, as it
>> leaves things in an unknown state.
>>
>> I would probably tend towards 'validate early and fail', as long as we are
>> choosing that for other commands. (I would still like to see status try even
>> if it gets a bogus ignore, but I understand why Robert feels so strongly about
>> 'add' even if I probably disagree.)
>>
>> Also, it is a little bit odd that if you *have* an invalid entry it fails wi...

Read more...

Unmerged revisions

5291. By Parth Malwankar on 2010-06-29

removed re_compile_checked as re.compile shows InvalidPattern error.
Fixed test cases.

5290. By Parth Malwankar on 2010-06-29

merged changes from trunk

5289. By Parth Malwankar on 2010-06-22

better invalid pattern error handling.
Invalid patterns are now handled in a single location,
bzrlib.globbing.Globster.

5288. By Parth Malwankar on 2010-06-21

merged in changes from trunk

5287. By Parth Malwankar on 2010-06-01

fixed scope error due to lazy ui import

5286. By Parth Malwankar on 2010-06-01

filter_invalid_patterns now returns invalid and valid patterns.
dispay is left to the caller.

5285. By Parth Malwankar on 2010-06-01

added unit test for is_pattern_valid and filter_invalid_patterns

5284. By Parth Malwankar on 2010-06-01

unit tests for get_pattern_errors

5283. By Parth Malwankar on 2010-06-01

improved comments

5282. By Parth Malwankar on 2010-06-01

updated NEWS.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-06-28 21:45:10 +0000
3+++ NEWS 2010-06-29 04:04:25 +0000
4@@ -84,6 +84,14 @@
5 test that all commands available to the test suite have help.
6 (Robert Collins, #177500)
7
8+* Improved handling of invalid ignore patterns. ``ignore`` rejects invalid
9+ patterns with a warning and does not add them to branch ignore file.
10+ In case of an invalid pattern in ignore files, bzr commands accessing
11+ the ignores fail with a clear error message indicating the invalid pattern
12+ and file. Invalid regular expression will now display a bzr error
13+ ``InvalidPattern`` instead of a stack trace.
14+ (Parth Malwankar, #300062)
15+
16 * Progress output is cleaned up when exiting. (Aaron Bentley)
17
18 * Raise ValueError instead of a string exception.
19
20=== modified file 'bzrlib/builtins.py'
21--- bzrlib/builtins.py 2010-06-17 08:53:15 +0000
22+++ bzrlib/builtins.py 2010-06-29 04:04:25 +0000
23@@ -2717,18 +2717,27 @@
24 (len(name_pattern) > 1 and name_pattern[1] == ':')):
25 raise errors.BzrCommandError(
26 "NAME_PATTERN should not be an absolute path")
27+ name_pattern_list, invalid_patterns = \
28+ globbing.filter_invalid_patterns(name_pattern_list)
29+ if invalid_patterns:
30+ pats = '\n '.join(invalid_patterns)
31+ ui.ui_factory.show_warning(
32+ 'Skipped invalid pattern(s):\n %s' % pats)
33+ if not name_pattern_list:
34+ # After removing invalid patterns, the list may be
35+ # empty. There is nothing to do in this case.
36+ return
37 tree, relpath = WorkingTree.open_containing(directory)
38 ignores.tree_ignores_add_patterns(tree, name_pattern_list)
39 ignored = globbing.Globster(name_pattern_list)
40 matches = []
41- tree.lock_read()
42+ self.add_cleanup(tree.lock_read().unlock)
43 for entry in tree.list_files():
44 id = entry[3]
45 if id is not None:
46 filename = entry[0]
47 if ignored.match(filename):
48 matches.append(filename)
49- tree.unlock()
50 if len(matches) > 0:
51 self.outf.write("Warning: the following files are version controlled and"
52 " match your ignore pattern:\n%s"
53
54=== modified file 'bzrlib/errors.py'
55--- bzrlib/errors.py 2010-06-17 11:52:13 +0000
56+++ bzrlib/errors.py 2010-06-29 04:04:25 +0000
57@@ -3156,3 +3156,10 @@
58 'E.g. bzr whoami "Your Name <name@example.com>"')
59
60
61+class InvalidPattern(BzrError):
62+
63+ _fmt = ('Invalid pattern(s) found. %(message)s')
64+
65+ def __init__(self, message):
66+ self.message = message
67+
68
69=== modified file 'bzrlib/globbing.py'
70--- bzrlib/globbing.py 2010-02-17 17:11:16 +0000
71+++ bzrlib/globbing.py 2010-06-29 04:04:25 +0000
72@@ -21,10 +21,19 @@
73 """
74
75 import re
76+import errno
77
78+import bzrlib
79+from bzrlib.lazy_regex import real_re_compile
80 from bzrlib.trace import (
81 warning
82 )
83+from bzrlib import (
84+ config,
85+ bzrdir,
86+ errors,
87+ ui,
88+ )
89
90
91 class Replacer(object):
92@@ -209,10 +218,24 @@
93
94 :return A matching pattern or None if there is no matching pattern.
95 """
96- for regex, patterns in self._regex_patterns:
97- match = regex.match(filename)
98- if match:
99- return patterns[match.lastindex -1]
100+ try:
101+ for regex, patterns in self._regex_patterns:
102+ match = regex.match(filename)
103+ if match:
104+ return patterns[match.lastindex -1]
105+ except errors.InvalidPattern, e:
106+ # lazy_regex throws InvalidPattern in case of bad pattern.
107+ # For Globbing, the bad pattern may have come from an ignore
108+ # file, so we should display any bad patterns that the ignore
109+ # file(s) may have.
110+ # The invalid pattern message (e.message) is specific to a
111+ # pattern that re.compile failed to compile. As we show all
112+ # possible pattern errors in .bazaar/ignore and .bzrignore
113+ # we set the message to '' and causing InvalidPattern to show
114+ # a generic error message.
115+ show_pattern_errors()
116+ e.message = ''
117+ raise e
118 return None
119
120 class ExceptionGlobster(object):
121@@ -283,3 +306,110 @@
122 if len(pattern) > 1:
123 pattern = pattern.rstrip('/')
124 return pattern
125+
126+
127+def is_pattern_valid(pattern):
128+ """Returns True if pattern is valid.
129+
130+ :param pattern: Normalized pattern.
131+ is_pattern_valid() assumes pattern to be normalized.
132+ see: normalize_pattern
133+ """
134+ result = True
135+ translator = None
136+ if pattern.startswith(u'RE:') or u'/' in pattern:
137+ translator = _sub_fullpath
138+ elif pattern.startswith(u'*.'):
139+ translator = _sub_extension
140+ else:
141+ translator = _sub_basename
142+ tpattern = '(%s)' % translator(pattern)
143+ try:
144+ cpattern = real_re_compile(tpattern)
145+ except re.error, e:
146+ result = False
147+ return result
148+
149+
150+def filter_invalid_patterns(patterns):
151+ """Returns valid and invalid patterns.
152+
153+ :param patterns: List of normalized patterns.
154+ filter_invalid_patterns() assumes pattern to be normalized.
155+ see: normalize_pattern
156+ """
157+ valid_patterns = []
158+ invalid_patterns = []
159+ for pat in patterns:
160+ if is_pattern_valid(pat):
161+ valid_patterns.append(pat)
162+ else:
163+ invalid_patterns.append(pat)
164+ return valid_patterns, invalid_patterns
165+
166+
167+def get_pattern_errors():
168+ """Return two tuples with information about invalid patterns.
169+
170+ :return: This returns two tuples in the format (path, patterns).
171+ path is the full path to either the ignores file or .bzrignore.
172+ patterns is a list of invalid patterns found in that file.
173+
174+ see also: show_pattern_errors()
175+ """
176+ # import ignores locally to avoid circular dependency
177+ # with bzrlib.ignores. We need ignores.parse_ignore_file
178+ # while bzrlib.ignores already uses bzrlib.globbing.
179+ from bzrlib import ignores
180+ def _get_errors(path):
181+ invalid_patterns = []
182+ f = None
183+ try:
184+ try:
185+ f = open(path, 'rb')
186+ patterns = ignores.parse_ignore_file(f)
187+ for p in patterns:
188+ if not is_pattern_valid(p):
189+ invalid_patterns.append(p)
190+ except (IOError, OSError), e:
191+ # open() shouldn't return an IOError without errno,
192+ # but just in case
193+ err = getattr(e, 'errno', None)
194+ if err != errno.ENOENT:
195+ # Its not necessary that the user with have
196+ # .bzrignore or .bazaar/ignore. If we are in
197+ # _get_errors she definitely has at least one,
198+ # but not necessarily both. So we ignore
199+ # errno.ENOENT
200+ raise e
201+ finally:
202+ if f:
203+ f.close()
204+ return invalid_patterns
205+
206+ wt, branch, relpath = \
207+ bzrdir.BzrDir.open_containing_tree_or_branch('.')
208+ bzrignore = wt.abspath(bzrlib.IGNORE_FILENAME)
209+ if wt:
210+ bzr_ignore_patterns = _get_errors(bzrignore)
211+ else:
212+ bzrignore = []
213+ bzr_ignore_patterns = []
214+ user_ignore = config.user_ignore_config_filename()
215+ user_ignore_patterns = _get_errors(user_ignore)
216+ return (bzrignore, bzr_ignore_patterns), (user_ignore, user_ignore_patterns)
217+
218+
219+def show_pattern_errors():
220+ """Displays a warning about invalid patterns found.
221+
222+ Display warning(s) with the list of invalid patterns
223+ found in ignores file or .bzrignore.
224+ """
225+ invalid_pattern_info = get_pattern_errors()
226+ for path, invalid_patterns in invalid_pattern_info:
227+ if invalid_patterns:
228+ pattern_str = '\n '.join(invalid_patterns)
229+ ui.ui_factory.show_error('Invalid pattern(s) found in "%s":\n %s'
230+ % (path, pattern_str))
231+
232
233=== modified file 'bzrlib/ignores.py'
234--- bzrlib/ignores.py 2010-05-03 04:51:58 +0000
235+++ bzrlib/ignores.py 2010-06-29 04:04:25 +0000
236@@ -44,10 +44,9 @@
237 ]
238
239
240-
241 def parse_ignore_file(f):
242 """Read in all of the lines in the file and turn it into an ignore list
243-
244+
245 Continue in the case of utf8 decoding errors, and emit a warning when
246 such and error is found. Optimise for the common case -- no decoding
247 errors.
248@@ -176,7 +175,6 @@
249 """Get the current set of runtime ignores."""
250 return _runtime_ignores
251
252-
253 def tree_ignores_add_patterns(tree, name_pattern_list):
254 """Add more ignore patterns to the ignore file in a tree.
255 If ignore file does not exist then it will be created.
256@@ -227,3 +225,4 @@
257
258 if not tree.path2id(bzrlib.IGNORE_FILENAME):
259 tree.add([bzrlib.IGNORE_FILENAME])
260+
261
262=== modified file 'bzrlib/lazy_regex.py'
263--- bzrlib/lazy_regex.py 2009-03-23 14:59:43 +0000
264+++ bzrlib/lazy_regex.py 2010-06-29 04:04:25 +0000
265@@ -22,6 +22,8 @@
266
267 import re
268
269+from bzrlib import errors
270+
271
272 class LazyRegex(object):
273 """A proxy around a real regex, which won't be compiled until accessed."""
274@@ -58,7 +60,12 @@
275
276 def _real_re_compile(self, *args, **kwargs):
277 """Thunk over to the original re.compile"""
278- return _real_re_compile(*args, **kwargs)
279+ try:
280+ return real_re_compile(*args, **kwargs)
281+ except re.error, e:
282+ # raise InvalidPattern instead of re.error as this gives a
283+ # cleaner message to the user.
284+ raise errors.InvalidPattern('"' + args[0] + '" ' +str(e))
285
286 def __getattr__(self, attr):
287 """Return a member from the proxied regex object.
288@@ -97,11 +104,11 @@
289 Though the first call will reset back to the original (it doesn't
290 track nesting level)
291 """
292- re.compile = _real_re_compile
293-
294-
295-_real_re_compile = re.compile
296-if _real_re_compile is lazy_compile:
297+ re.compile = real_re_compile
298+
299+
300+real_re_compile = re.compile
301+if real_re_compile is lazy_compile:
302 raise AssertionError(
303 "re.compile has already been overridden as lazy_compile, but this would" \
304 " cause infinite recursion")
305
306=== modified file 'bzrlib/log.py'
307--- bzrlib/log.py 2010-06-17 08:53:15 +0000
308+++ bzrlib/log.py 2010-06-29 04:04:25 +0000
309@@ -86,7 +86,6 @@
310 format_date,
311 format_date_with_offset_in_original_timezone,
312 get_terminal_encoding,
313- re_compile_checked,
314 terminal_width,
315 )
316 from bzrlib.symbol_versioning import (
317@@ -811,8 +810,7 @@
318 """
319 if search is None:
320 return log_rev_iterator
321- searchRE = re_compile_checked(search, re.IGNORECASE,
322- 'log message filter')
323+ searchRE = re.compile(search, re.IGNORECASE)
324 return _filter_message_re(searchRE, log_rev_iterator)
325
326
327
328=== modified file 'bzrlib/osutils.py'
329--- bzrlib/osutils.py 2010-06-25 20:34:05 +0000
330+++ bzrlib/osutils.py 2010-06-29 04:04:25 +0000
331@@ -2169,29 +2169,6 @@
332 raise
333
334
335-def re_compile_checked(re_string, flags=0, where=""):
336- """Return a compiled re, or raise a sensible error.
337-
338- This should only be used when compiling user-supplied REs.
339-
340- :param re_string: Text form of regular expression.
341- :param flags: eg re.IGNORECASE
342- :param where: Message explaining to the user the context where
343- it occurred, eg 'log search filter'.
344- """
345- # from https://bugs.launchpad.net/bzr/+bug/251352
346- try:
347- re_obj = re.compile(re_string, flags)
348- re_obj.search("")
349- return re_obj
350- except re.error, e:
351- if where:
352- where = ' in ' + where
353- # despite the name 'error' is a type
354- raise errors.BzrCommandError('Invalid regular expression%s: %r: %s'
355- % (where, re_string, e))
356-
357-
358 if sys.platform == "win32":
359 import msvcrt
360 def getchar():
361
362=== modified file 'bzrlib/tests/__init__.py'
363--- bzrlib/tests/__init__.py 2010-06-25 06:12:56 +0000
364+++ bzrlib/tests/__init__.py 2010-06-29 04:04:25 +0000
365@@ -2742,8 +2742,7 @@
366 :param pattern: A regular expression string.
367 :return: A callable that returns True if the re matches.
368 """
369- filter_re = osutils.re_compile_checked(pattern, 0,
370- 'test filter')
371+ filter_re = re.compile(pattern, 0)
372 def condition(test):
373 test_id = test.id()
374 return filter_re.search(test_id)
375
376=== modified file 'bzrlib/tests/blackbox/test_add.py'
377--- bzrlib/tests/blackbox/test_add.py 2010-02-23 07:43:11 +0000
378+++ bzrlib/tests/blackbox/test_add.py 2010-06-29 04:04:25 +0000
379@@ -18,8 +18,10 @@
380 """Tests of the 'bzr add' command."""
381
382 import os
383+import re
384
385 from bzrlib import (
386+ ignores,
387 osutils,
388 tests,
389 )
390@@ -217,3 +219,31 @@
391 os.symlink(osutils.abspath('target'), 'tree/link')
392 out = self.run_bzr(['add', 'tree/link'])[0]
393 self.assertEquals(out, 'adding link\n')
394+
395+ def test_add_invalid_ignore(self):
396+ """add should handle invalid ignore pattern gracefully.
397+ """
398+ tree = self.make_branch_and_tree('.')
399+ self.build_tree(['top.txt'])
400+ self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
401+ out, err = self.run_bzr(['add'], 3)
402+ self.assertEquals(out, '')
403+ self.assertContainsRe(err,
404+ ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
405+ 'Invalid pattern'),
406+ re.DOTALL)
407+
408+ def test_add_invalid_ignore_in_global_ignore(self):
409+ """add should handle invalid global ignore pattern gracefully.
410+ """
411+ tree = self.make_branch_and_tree('.')
412+ self.build_tree(['top.txt'])
413+ self.build_tree_contents([('foobar', 'foo\n')])
414+ ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
415+ out, err = self.run_bzr(['add'], 3)
416+ self.assertEquals(out, '')
417+ self.assertContainsRe(err,
418+ ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
419+ 'Invalid pattern'),
420+ re.DOTALL)
421+
422
423=== modified file 'bzrlib/tests/blackbox/test_ignore.py'
424--- bzrlib/tests/blackbox/test_ignore.py 2010-06-11 07:32:12 +0000
425+++ bzrlib/tests/blackbox/test_ignore.py 2010-06-29 04:04:25 +0000
426@@ -162,3 +162,36 @@
427 tree = self.make_branch_and_tree('a')
428 self.run_bzr(['ignore', '--directory=a', 'README'])
429 self.check_file_contents('a/.bzrignore', 'README\n')
430+
431+ def test_ignored_invalid_pattern(self):
432+ """Ensure graceful handling for invalid ignore pattern.
433+
434+ Test case for #300062.
435+ Invalid pattern should show clear error message.
436+ Invalid pattern should not be added to .bzrignore file.
437+ """
438+ tree = self.make_branch_and_tree('.')
439+ out, err = self.run_bzr(['ignore', 'RE:*.cpp'])
440+ self.assertEqual(out, '')
441+ self.assertContainsRe(err,
442+ 'bzr: warning: Skipped invalid pattern.*RE:\*\.cpp', re.DOTALL)
443+ self.assertFalse(os.path.isfile('.bzrignore'))
444+
445+ def test_invalid_pattern_in_ignore_file(self):
446+ """Existing invalid patterns should be shown.
447+ """
448+ tree = self.make_branch_and_tree('.')
449+ self.build_tree(['file'])
450+ self.build_tree_contents([('.bzrignore', 'RE:[\n')])
451+ out, err = self.run_bzr(['ignore', 'RE:*.cpp', 'abc'], 3)
452+ self.assertEqual(out, '')
453+ self.assertContainsRe(err, 'warning: Skipped invalid pattern'
454+ '.*RE:\*\.cpp.*'
455+ 'error: Invalid pattern.*RE:\[', re.DOTALL)
456+ f = open('.bzrignore', 'r')
457+ try:
458+ lines = f.readlines()
459+ self.assertTrue('abc\n' in lines)
460+ finally:
461+ f.close()
462+
463
464=== modified file 'bzrlib/tests/blackbox/test_ignored.py'
465--- bzrlib/tests/blackbox/test_ignored.py 2010-06-11 07:32:12 +0000
466+++ bzrlib/tests/blackbox/test_ignored.py 2010-06-29 04:04:25 +0000
467@@ -17,6 +17,8 @@
468
469 """Tests of the 'bzr ignored' command."""
470
471+import re
472+
473 from bzrlib.tests import TestCaseWithTransport
474
475
476@@ -46,3 +48,15 @@
477 ('a/.bzrignore', 'README')])
478 out, err = self.run_bzr(['ignored', '--directory=a'])
479 self.assertStartsWith(out, 'README')
480+
481+ def test_invalid_pattern_in_ignore_file(self):
482+ """Existing invalid patterns should be shown.
483+ """
484+ tree = self.make_branch_and_tree('.')
485+ self.build_tree(['file'])
486+ self.build_tree_contents([('.bzrignore', 'RE:[\n')])
487+ out, err = self.run_bzr(['ignored'], 3)
488+ self.assertEqual(out, '')
489+ self.assertContainsRe(err, 'error: Invalid pattern.*RE:\[.*ERROR:',
490+ re.DOTALL)
491+
492
493=== modified file 'bzrlib/tests/blackbox/test_log.py'
494--- bzrlib/tests/blackbox/test_log.py 2010-06-25 08:01:24 +0000
495+++ bzrlib/tests/blackbox/test_log.py 2010-06-29 04:04:25 +0000
496@@ -386,11 +386,8 @@
497 """
498 self.make_minimal_branch()
499 out, err = self.run_bzr(['log', '-m', '*'], retcode=3)
500- self.assertEqual("bzr: ERROR: Invalid regular expression"
501- " in log message filter"
502- ": '*'"
503- ": nothing to repeat\n", err)
504- self.assertEqual('', out)
505+ self.assertContainsRe(err, "ERROR.*Invalid pattern.*nothing to repeat")
506+ self.assertEqual(out, '')
507
508 def test_log_unsupported_timezone(self):
509 self.make_linear_branch()
510
511=== modified file 'bzrlib/tests/blackbox/test_status.py'
512--- bzrlib/tests/blackbox/test_status.py 2010-04-07 21:34:13 +0000
513+++ bzrlib/tests/blackbox/test_status.py 2010-06-29 04:04:25 +0000
514@@ -24,12 +24,14 @@
515
516 from cStringIO import StringIO
517 import codecs
518+import re
519 from os import mkdir, chdir, rmdir, unlink
520 import sys
521 from tempfile import TemporaryFile
522
523 from bzrlib import (
524 bzrdir,
525+ ignores,
526 conflicts,
527 errors,
528 osutils,
529@@ -701,6 +703,57 @@
530 out, err = self.run_bzr('status tree/a')
531 self.assertNotContainsRe(out, 'pending merge')
532
533+ def test_status_invalid_ignore_pattern(self):
534+ """Tests status with invalid ignore pattern"""
535+ wt = self.make_branch_and_tree('.')
536+ self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
537+ wt.commit('commit .bzrignore')
538+ out, err = self.run_bzr(['status'], 3)
539+ self.assertEquals(out, '')
540+ self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
541+ flags=re.DOTALL)
542+
543+ def test_status_invalid_ignore_pattern_in_global_ignore(self):
544+ """Tests status with invalid ignore pattern in .bazaar/ignore"""
545+ wt = self.make_branch_and_tree('.')
546+ self.build_tree(['file'])
547+ wt.add('file')
548+ wt.commit('one')
549+ # add unknown file
550+ f = open('foobar', 'w')
551+ try:
552+ f.write(u"foo\n") # invalid pattern
553+ finally:
554+ f.close()
555+ ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
556+ out, err = self.run_bzr(['status'], 3)
557+ self.assertEquals(out, '')
558+ self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
559+ flags=re.DOTALL)
560+
561+ def test_short_status_invalid_ignore_pattern(self):
562+ """Tests short status with invalid ignore pattern"""
563+ wt = self.make_branch_and_tree('.')
564+ self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
565+ wt.commit('commit .bzrignore')
566+ out, err = self.run_bzr(['status', '-S'], 3)
567+ self.assertEquals(out, '')
568+ self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
569+ flags=re.DOTALL)
570+
571+ def test_specific_status_invalid_ignore_pattern(self):
572+ """Tests specific status with invalid ignore pattern"""
573+ wt = self.make_branch_and_tree('.')
574+ self.build_tree(['file'])
575+ wt.add('file')
576+ wt.commit('added file')
577+ self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
578+ wt.commit('commit .bzrignore')
579+ out, err = self.run_bzr(['status', 'file'], 3)
580+ self.assertEquals(out, '')
581+ self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
582+ flags=re.DOTALL)
583+
584
585 class TestStatusEncodings(TestCaseWithTransport):
586
587
588=== modified file 'bzrlib/tests/test_globbing.py'
589--- bzrlib/tests/test_globbing.py 2010-02-17 17:11:16 +0000
590+++ bzrlib/tests/test_globbing.py 2010-06-29 04:04:25 +0000
591@@ -15,15 +15,20 @@
592 # along with this program; if not, write to the Free Software
593 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
594
595+from bzrlib import config, ignores
596 from bzrlib.globbing import (
597 Globster,
598 ExceptionGlobster,
599+ filter_invalid_patterns,
600+ get_pattern_errors,
601+ is_pattern_valid,
602 _OrderedGlobster,
603 normalize_pattern
604 )
605 from bzrlib.tests import (
606 TestCase,
607 TestCaseInTempDir,
608+ TestCaseWithTransport,
609 )
610
611
612@@ -371,3 +376,48 @@
613 """tests that multiple mixed slashes are collapsed to single forward
614 slashes and trailing mixed slashes are removed"""
615 self.assertEqual(u'/foo/bar', normalize_pattern(u'\\/\\foo//\\///bar/\\\\/'))
616+
617+class TestInvalidPattern(TestCaseWithTransport):
618+
619+ def test_is_pattern_valid(self):
620+ """Ensure is_pattern_valid() detects invalid patterns.
621+ """
622+ self.assertTrue(is_pattern_valid(normalize_pattern(u'foo')))
623+ self.assertFalse(is_pattern_valid(normalize_pattern(u'RE:*.cpp')))
624+
625+ def test_filter_invalid_patterns(self):
626+ """Ensure filter_invalid_patterns() returns only valid patterns.
627+ """
628+ patterns = [u'abc', u'RE:*.cpp', u'[', u'def']
629+ patterns = [normalize_pattern(p) for p in patterns]
630+ ref_valid_patterns = set([u'abc', u'def'])
631+ ref_invalid_patterns = set([u'RE:*.cpp', u'['])
632+ valid_patterns, invalid_patterns = filter_invalid_patterns(patterns)
633+ self.assertEqual(ref_valid_patterns, set(valid_patterns))
634+ self.assertEqual(ref_invalid_patterns, set(invalid_patterns))
635+
636+ def test_invalid_ignores(self):
637+ """Ensure that invalid user ignores are detected.
638+ """
639+ ignore_path = config.user_ignore_config_filename()
640+ user_ignores = ignores.get_user_ignores()
641+ f = open(ignore_path, 'wb')
642+ try:
643+ f.write(u'foo\nRE:*.cpp\nbar\nRE:[\nbaz')
644+ finally:
645+ f.close()
646+ (bi, bip), (ui, uip) = get_pattern_errors()
647+ self.assertEquals(set(uip), set([u'RE:*.cpp', u'RE:[']))
648+ self.assertEquals(bip, [])
649+
650+ def test_invalid_bzrignore(self):
651+ """Ensure that invalid .bzrignore entries are detected.
652+ """
653+ tree = self.make_branch_and_tree(".")
654+ self.build_tree_contents([('.bzrignore',
655+ "foo\nRE:*.cpp\nbar\nRE:[\nbaz\n")])
656+ tree.add([".bzrignore"])
657+ (bi, bip), (ui, uip) = get_pattern_errors()
658+ self.assertEquals(set(bip), set([u'RE:*.cpp', u'RE:[']))
659+ self.assertEquals(uip, [])
660+
661
662=== modified file 'bzrlib/tests/test_ignores.py'
663--- bzrlib/tests/test_ignores.py 2010-05-03 04:51:58 +0000
664+++ bzrlib/tests/test_ignores.py 2010-06-29 04:04:25 +0000
665@@ -217,3 +217,4 @@
666 ignores.tree_ignores_add_patterns(tree, ["myentry2", "foo"])
667 self.assertEquals(open('.bzrignore', 'rb').read(), 'myentry1\r\nmyentry2\r\nfoo\r\n')
668 self.assertPatternsEquals(["myentry1", "myentry2", "foo"])
669+
670
671=== modified file 'bzrlib/tests/test_osutils.py'
672--- bzrlib/tests/test_osutils.py 2010-06-08 01:45:09 +0000
673+++ bzrlib/tests/test_osutils.py 2010-06-29 04:04:25 +0000
674@@ -1703,24 +1703,6 @@
675 self.assertRaises(IOError, osutils.resource_string, 'bzrlib', 'yyy.xx')
676
677
678-class TestReCompile(tests.TestCase):
679-
680- def test_re_compile_checked(self):
681- r = osutils.re_compile_checked(r'A*', re.IGNORECASE)
682- self.assertTrue(r.match('aaaa'))
683- self.assertTrue(r.match('aAaA'))
684-
685- def test_re_compile_checked_error(self):
686- # like https://bugs.launchpad.net/bzr/+bug/251352
687- err = self.assertRaises(
688- errors.BzrCommandError,
689- osutils.re_compile_checked, '*', re.IGNORECASE, 'test case')
690- self.assertEqual(
691- "Invalid regular expression in test case: '*': "
692- "nothing to repeat",
693- str(err))
694-
695-
696 class TestDirReader(tests.TestCaseInTempDir):
697
698 # Set by load_tests