Merge lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern into lp:bzr

Proposed by Parth Malwankar
Status: Work in progress
Proposed branch: lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern
Merge into: lp:bzr
Diff against target: 698 lines (+360/-66)
17 files modified
NEWS (+8/-0)
bzrlib/builtins.py (+11/-2)
bzrlib/errors.py (+7/-0)
bzrlib/globbing.py (+134/-4)
bzrlib/ignores.py (+2/-3)
bzrlib/lazy_regex.py (+13/-6)
bzrlib/log.py (+1/-3)
bzrlib/osutils.py (+0/-23)
bzrlib/tests/__init__.py (+1/-2)
bzrlib/tests/blackbox/test_add.py (+30/-0)
bzrlib/tests/blackbox/test_ignore.py (+33/-0)
bzrlib/tests/blackbox/test_ignored.py (+14/-0)
bzrlib/tests/blackbox/test_log.py (+2/-5)
bzrlib/tests/blackbox/test_status.py (+53/-0)
bzrlib/tests/test_globbing.py (+50/-0)
bzrlib/tests/test_ignores.py (+1/-0)
bzrlib/tests/test_osutils.py (+0/-18)
To merge this branch: bzr merge lp:~parthm/bzr/300062-better-handling-for-invalid-ignore-pattern
Reviewer Review Type Date Requested Status
John A Meinel Needs Fixing
Robert Collins (community) Needs Fixing
Review via email: mp+26400@code.launchpad.net

Commit message

Improved handling of invalid ignore patterns by add, ignore, ignored and status.

Description of the change

=== Fixes Bug #300062 ===

This started out as a simple bug fix to ensure that 'bzr ignore' handles invalid patterns correctly but as invalid patterns impacted other commands this patch ended up fixing those as well. bzr can encounter invalid patterns in either .bzrignore file or .bazaar/ignore file. These could earlier get in either by using 'bzr ignore' or by editing the files directly. The former is now fixed. This patch fixes the following:
1. 'bzr ignore' when supplied with possibly multiple patterns, filters out invalid patters and only adds valid patterns to .bzrignore

2. 'bzr status' checks ignore patterns before showing status. This was crashing if invalid patterns were present. 'bzr status' now issues a warning if invalid pattern(s) are seen and does not crash. In order to avoid any performance penalty, `status` works as before, but if it sees sees re.error then it parser both ignore files and prints out the bad pattern(s).

3. 'bzr add' checks ignore patterns before adding files. This was crashing if invalid patterns were present. 'bzr add' now exists with an error message indicating invalid patterns. Initially I was trying to decide between a warning or error for this but decided that its better for `add` to not be too smart as it might end up adding files that were meant to be ignored.

Implementations Notes
=====================
As `re` is lazy by default. I have added a function `bzrlib.lazy_regex.eager_re_compile` that is used by `bzrlib.globbing.is_pattern_valid` which in turn is used by to create `bzrlib.globbing.filter_invalid_patterns`. This is used by `ignore` command to ensure that no junk enters .bzrignore.

`bzrlib.ignores` has two new functions, `get_pattern_errors` gives information about bad patterns in .bzrignore and .bazaar/ignore. `show_pattern_errors` is used to display warning about bad patterns to the user. `show_pattern_errors` is used by `status`. `get_pattern_errors` is used by `add` which raises error `InvalidIgnorePattern`.

I noticed that imports in bzrlib.status and bzrlib.ignores were not lazy. So I have made these lazy and I had to add a few imports to these files.

Tests have been added as needed.

To post a comment you must log in.
Revision history for this message
Robert Collins (lifeless) wrote :

So, this is comprehensive, and I like that. I have a general point,
and a couple of specifics. I haven't done a line by line reading [yet,
 I will after the general point is discussed].

Firstly, the general point. I think its fine, if the ignore file is
damaged to just stop. add doesn't need to be able to succeed when the
environment is broken. Stop (let the exception propogate) and have the
user fix the issue. Better that than adding something they wanted
ignored, I think. Similarly for status.

Now, I might be misreading, and thats what you're doing : but if thats
what you're doing, its leaking up into code that it shouldn't. So I'd
like to see this patch with no changes to code that uses ignores - no
changes should be needed to do what I've described. This is because I
don't think the approach of scattering try/excepts around in higher
level code makes sense: push it down into the ignores module.

Secondly, the specific issue - to expose the real re compile function,
just assign it to a variable that can be accessed: wrappers of the
form:
def a_foo(param):
   result = real_foo(param)
   return result

can be rewritten as
def a_foo(param):
    return real_foo(param)

which is precisely equivalent to
a_foo = real_foo

So in this case, just call the _real_re_compile function, as is. No
changes needed.

sidenote: I rather suspect theres a bunch of edits here we may end up
not wanting, and perhaps we'll want to flatten the history to avoid
that overhead.

-Rob

Revision history for this message
Robert Collins (lifeless) wrote :

Oh, and one more thing: the tests you're writing are very far away
from the code you're testing; please consider using tests for the
specific code, not blackbox tests. blackbox tests should really only
be needed when you're changing code in builtins.py or the few other
command-class containing files.

-Rob

Revision history for this message
Parth Malwankar (parthm) wrote :

> So, this is comprehensive, and I like that. I have a general point,
> and a couple of specifics. I haven't done a line by line reading [yet,
> I will after the general point is discussed].
>
> Firstly, the general point. I think its fine, if the ignore file is
> damaged to just stop. add doesn't need to be able to succeed when the
> environment is broken. Stop (let the exception propogate) and have the
> user fix the issue. Better that than adding something they wanted
> ignored, I think. Similarly for status.
>

I have update `status` to fail as well. `add` was already raising an
exception for invalid pattern. Making it consistent does make the
code cleaner.

> Secondly, the specific issue - to expose the real re compile function,
> just assign it to a variable that can be accessed: wrappers of the
> form:
> def a_foo(param):
> result = real_foo(param)
> return result
>
> can be rewritten as
> def a_foo(param):
> return real_foo(param)
>
> which is precisely equivalent to
> a_foo = real_foo
>
>
> So in this case, just call the _real_re_compile function, as is. No
> changes needed.
>

I was reluctant to use _real_re_compile as it was private. Its now
renamed to real_re_compile and used directly.

> sidenote: I rather suspect theres a bunch of edits here we may end up
> not wanting, and perhaps we'll want to flatten the history to avoid
> that overhead.
>

I am not sure I understand. Do you mean using bzr-rewrite?

> -Rob

Thanks for the review.

Revision history for this message
Parth Malwankar (parthm) wrote :

> Oh, and one more thing: the tests you're writing are very far away
> from the code you're testing; please consider using tests for the
> specific code, not blackbox tests. blackbox tests should really only
> be needed when you're changing code in builtins.py or the few other
> command-class containing files.
>

Done. I have added unit tests for the new functions along with the
blackbox tests.

Revision history for this message
Parth Malwankar (parthm) wrote :

Some other things are fixed in the latest set of changes:
 - `ignored` was crashing on invalid pattern.
 - lock/unlock for ignore wasn't in try/finally or cleanup.
 - As InvalidPattern exception is thrown from the lowest level, any uncaught regex error in bzr will give a InvalidPattern bzr error rather than stack trace. `add`, `ignore`, `ignored`, `status` catch this to give bad ignore pattern info.

Revision history for this message
Robert Collins (lifeless) wrote :

Why do they need to catch it? Why can't the exception __str__ just
show it, avoiding all changes to add/ignore/ignored and status
commands?

-Rob

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> Why do they need to catch it? Why can't the exception __str__ just
> show it, avoiding all changes to add/ignore/ignored and status
> commands?
>
> -Rob

Just mentioning that I don't think falling over and failing on an error
is the best route. When people want to run "bzr status" they want it to
complete. Completely blocking the user because they have a typo in their
ignore file is not going to win "user-friendly" points.

Obviously Robert and I disagree here, though, so we may need someone
else to chime in as a tiebreaker.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwFHfQACgkQJdeBCYSNAAM68QCgkT7TycEQ5NeRcs1rXgfRklrl
Bq8AoKfmmTzMfK+IbGrwaQ41kEuJWWvf
=n8/E
-----END PGP SIGNATURE-----

Revision history for this message
Parth Malwankar (parthm) wrote :

> Why do they need to catch it? Why can't the exception __str__ just
> show it, avoiding all changes to add/ignore/ignored and status
> commands?
>
> -Rob

regex are used in other places like `log -m`. So, a failure
doesn't necessarily mean its an ignore pattern issue. Hence,
the caller needs to decide whats the best way to deal with
it. Also, in some cases we want to show a warning (ignore)
and in others, an error (add).

Revision history for this message
Parth Malwankar (parthm) wrote :

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Robert Collins wrote:
> > Why do they need to catch it? Why can't the exception __str__ just
> > show it, avoiding all changes to add/ignore/ignored and status
> > commands?
> >
> > -Rob
>
> Just mentioning that I don't think falling over and failing on an error
> is the best route. When people want to run "bzr status" they want it to
> complete. Completely blocking the user because they have a typo in their
> ignore file is not going to win "user-friendly" points.
>

Having `status` work seems reasonable to me because in case of failed
ignores the files displayed end up as 'unknown' and the rest of the
status should be the same. The code does get a little ugly as we need
to catch the exception in three places in status.py.

Revision history for this message
Robert Collins (lifeless) wrote :

John, lets explore this a little more.

Possibly the most common case of a broken ignore file will be new
users, learning how to use the system, and running 'bzr st' on a newly
inited tree, showing hundreds or thousands of files.

I don't think a little warning message shown partway through the
output will win user friendly points - and they may well be scratching
their head going 'why won't it ignore what I asked it to': people
learning a new system can often get rattled.

I don't think stopping cold is necessarily better than outputting
details about whats wrong, but I am not at all convinced that
completing-with-potentially-very-many-unknowns is better.

I'm concerned about having more try:excepts in the very innermost loop
of what we do, we've had to take them out of code paths before, and
status is one of the most heavily exercised paths. What I saw of the
earlier diff had try: excepts: inside inner loops, rather than once,
at compilation of the regex time. Data will help (test on a 20K file
tree, with 40K ignored files (.o's and .d's)

Also, as we only compile *once* there is a clear bug - if the
categorisation fails on the first file, and the latter ones just fail
open to unknowns, then first file *also* should fail open to an
unknown, and there is no reason to be raising an exception - just log
directly from the compilation failure point.

Revision history for this message
Martin Pool (mbp) wrote :

On 2 June 2010 00:52, John A Meinel <email address hidden> wrote:
> Just mentioning that I don't think falling over and failing on an error
> is the best route. When people want to run "bzr status" they want it to
> complete. Completely blocking the user because they have a typo in their
> ignore file is not going to win "user-friendly" points.

I would tend to expect it to give a user warning not an error. If
it's unreasonably messy or slow to implement I wouldn't absolutely
insist on it. A user friendly error is an improvement on an internal
error.

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Robert Collins (lifeless) wrote :

To be clear, I was suggesting a user user - nicely formatted, and stop
the operation, not a traceback or InternalError.

Revision history for this message
Parth Malwankar (parthm) wrote :

This is how the output currently looks (with status showing an error):

[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr ignore "RE:[" aaa
bzr: warning: Skipped invalid pattern(s):
  RE:[
========================================================================
[aaa]% echo "RE:[" >> .bzrignore
[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/aaa/.bzrignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
========================================================================
[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr add
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/aaa/.bzrignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
========================================================================
[aaa]% echo "RE:*.cpp" >> ~/.bazaar/ignore
[aaa]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr add
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/aaa/.bzrignore":
  RE:[
bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
  RE:*.cpp
bzr: ERROR: Invalid pattern(s) found.
[aaa]%

Revision history for this message
Parth Malwankar (parthm) wrote :
Download full text (3.6 KiB)

I had an interesting discussion with jam regarding warning or error on irc. Unfortunately no conclusion yet :P

<jam> hi parthm
<parthm> jam: hi, i was thinking some more about the `bzr st` bad ignore pattern
i am somewhat divided between warning / error but am leaning towards error. if we have a bad pattern other commands like `add` and `ignored` would fail anyway.
<parthm> i am not sure if there are others. so it may be better to fail quickly and have the user fix it sooner than later.
<parthm> there could also be a case that a user ends up pushing a bad pattern to their server/lp. as .bzrignore is already added. its just a `bzr ci` and `bzr push`.
so failing on `st` may be a good idea.
<jam> parthm: why would add and ignore have to fail?
the point is that, especially right now, people may have a bad config
having it fall over when it used to work is bad
having it fall over *in general* is bad
I don't feel super strongly
but we've done the 'error early' in the past
and generally it has provided worse user experience
as a minor misconfiguration in X
suddenly blocks Y and Z from working at all
<parthm> jam: `ignore` works, it issues a warning and discards the bad pattern.
<jam> even if the pattern was already in the file?
<parthm> `ignored` fails as we compile all ignored patterns (99 at a time) into one pattern. so we cant selectively ignore.
<jam> you certainly *could*
<parthm> jam: yes. if a file has a bad pattern, `ignore` would still work. so it issues a warning about the pattern on the command line. if the tree is processed, then a followup error is shown about the existing bad pattern.
<jam> parthm: the problem is that if we filter out the bad pattern from the disk content we make it harder to get the right pattern
since the old one is gone
it may be a fair trade
but something to be thought about
<parthm> jam: we don't filter out the bad pattern from the file. filtering is only for the command like. we display an error message for the pattern in file.
jam: http://pastebin.com/fMWtEUjc
<jam> so for compiling, we could certainly attempt the 100-way compile and if it fails rollback
<parthm> jam: i am not sure i understand. are you talking about `ignored`?
<jam> yes
but also statu
status
add, etc
<parthm> jam: what do you mean by try the compile and rollback?
<jam> parthm: try to compile all 100, if it fails, compile them individually to find the one that failed
<parthm> jam: thats whats done in the current patch. the current patch allow the operation to happen, if there is an exception, it processes the file to find the bad pattern and displays it along with the error message.
<jam> so why does ignored *have* to fail?
<parthm> jam: so are you suggesting we retry after filtering out the bad pattern at runtime?
<jam> That is what I've been suggesting, yes
<parthm> jam: IMO once the user has a bad pattern in the file he would sooner or later end up with a problem so it may be better to just error out pointing to the pattern.
i don't think existing installation would have a bad pattern as it would make bzr crash.
<jam> parthm: the difference is whether someone can still finish what they were trying to do
*before* they have to fix everything
o...

Read more...

Revision history for this message
Vincent Ladeuil (vila) wrote :

+1 for warning, blocking a user is bad, especially if .bzrignore is
versioned and the said user just got one with a bad pattern (some people
will do that)

Revision history for this message
Parth Malwankar (parthm) wrote :

It seem to me that the majority prefers bzr to be more forgiving about bad patterns. I can go ahead with a warning based approach if there are no objections to this over the next day or two.

Here is what I understand.

1. `add` - This should fail (error) as we don't wan't to add file that the user may want to ignore
2. `ignore` - Show warning and skip any bad patterns found on the command line.
3. `ignored` - Try to show ignored files. If this errors out with bad regex, filter out the faulting regex(es) at runtime and retry. The disk files are not changes. Issue a warning about faulting regexes.
4. `status` - If we see a regex failure, filter out the faulting pattern at runtime and retry the ignored files. The retry based approach should reduce the number of 'unknown' files listed.

This would also impact `ls -i/-u`. I will look into that some more. I suppose we would want a similar retry based behavior for this also.

Any further thoughts/ideas about the user interaction should be? Any more commands that might be impacted?

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Parth Malwankar wrote:
> It seem to me that the majority prefers bzr to be more forgiving about bad patterns. I can go ahead with a warning based approach if there are no objections to this over the next day or two.
>
> Here is what I understand.
>
> 1. `add` - This should fail (error) as we don't wan't to add file that the user may want to ignore

I would actually disagree and just warn. The user can always 'bzr
revert' or 'bzr rm'. Especially if they were supplying explicit paths
("bzr add foo" shouldn't care about ignore patterns, because it always
adds foo even if it was ignored.)

> 2. `ignore` - Show warning and skip any bad patterns found on the command line.
> 3. `ignored` - Try to show ignored files. If this errors out with bad regex, filter out the faulting regex(es) at runtime and retry. The disk files are not changes. Issue a warning about faulting regexes.
> 4. `status` - If we see a regex failure, filter out the faulting pattern at runtime and retry the ignored files. The retry based approach should reduce the number of 'unknown' files listed.
>
> This would also impact `ls -i/-u`. I will look into that some more. I suppose we would want a similar retry based behavior for this also.
>
> Any further thoughts/ideas about the user interaction should be? Any more commands that might be impacted?
>

I would look to do it the 'cleanest' way internally. My feeling is that
turning everything into a warning is likely to be fairly straightforward
and the most worthwhile.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwH7UQACgkQJdeBCYSNAAN3hACgvtuI/Jb1yJ8pJQPsGxwfG0f0
eF8AoI+7+E1fKzOZZ4kq3phkxCJmTEXL
=fiKw
-----END PGP SIGNATURE-----

Revision history for this message
Robert Collins (lifeless) wrote :

I'd like to see a really clean internal behaviour + a good user experience.

I don't agree with John that the user may have a bad ignore file
already: compilation will be breaking, and it will be throwing the
native RE error at the moment, so existing, broken files are fine to
error on. The point is to error:
 - cleanly
 - helpfully
 - in a way they can correct

(and correcting a .bzrignore file is easy)

I have a strong feeling that, particularly for add and status in newly
created trees, erroring is actually more user friendly that warning:
warnings can get lost and ignored, particularly when you're dealing
with hundreds or thousands of lines of output. However, I think any
improvement is better than what we have today, so we should just do
whatever Parth wants to do here, and iterate when we find out its
wrong ;).

In terms of internal code, I'd like it if whatever approach is taken:
 - does not need handling code per-code-path-using-ignores
 - is entirely encapsulated in ignores.py
 - doesn't add a new exception handler to every unversioned file
considered by add

This is totally doable, I think, for either warnings or erroring, and
not something I expect any of us to disagree on:)

-Rob

Revision history for this message
Robert Collins (lifeless) wrote :

Ok, I've looked through this again, and I'd like to suggest a deeper implementation strategy.

From the top level, whats going on here is this:
 - working trees have some user data (ignore rules)
 - we compile the user rules into an intermediary language (a list of regexes which we | together)
 - we lazy-compile that intermediary form into a list of re's we can match things against.

So there are the following translation steps that can fail:
 - ignores line -> string regex
 - string regex -> compiled regex

Constraint wise we don't want to compile re's more than once in normal circumstances.

We don't want to expose internal workings unless needed.

We want all commands that use ignores to benefit.

If you look at ExceptionGlobster and Globster, these objects already know about all the patterns, and they are responsible for the glob->re compilation too. As you have made their match raise InvalidPattern error, its actually entirely reasonable for the outer most one (ExceptionGlobster) to when it has a pattern error, check all of its regexs on the spot, and raise an InvalidPatternError which has all the patterns that are wrong.

That won't affect log because log doesn't using globbing (and if it did that would be ok too, because it would show the bad pattern); it will also help 'rules' if you do this to _OrderedGlobster as well.

So in short:
 - unwind all the changes to builtins and ignores, they are the wrong level and API to be fixing this; change just globbing.py and add the new error to errors.py; add some unit tests for this case to *Globster.match, and we're done.

review: Needs Fixing
5288. By Parth Malwankar

merged in changes from trunk

5289. By Parth Malwankar

better invalid pattern error handling.
Invalid patterns are now handled in a single location,
bzrlib.globbing.Globster.

Revision history for this message
Parth Malwankar (parthm) wrote :

> Ok, I've looked through this again, and I'd like to suggest a deeper
> implementation strategy.

Thanks for the review and comment. I have cleanup the error handling now.
Basically, the following is done:
1. General regex compile failure now raises InvalidPattern error.
2. Invalid pattern errors are now raised only by bzrlib.globbing.Globster.match
3. `bzr ignore PATTERN..` skips invalid patterns passed. Existing errors are handled by bzrlib.globbing.Globster.match.

Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (4.7 KiB)

1) It might be clearer to resubmit this, since it has changed fairly significantly since the beginning. (So I don't think Robert's Needs Fixing still directly applies.)

2)
61 +class InvalidPattern(BzrError):
62 +
63 + _fmt = ('Invalid pattern(s) found.')
64 +
65 + def __init__(self, message):
66 + self.message = message
67 +

^- You set a message, but then never display it (there is nothing in _fmt which shows the attributes.)

I suppose in builtins you do:
29 + if invalid_patterns:
30 + pats = '\n '.join(invalid_patterns)
31 + ui.ui_factory.show_warning(
32 + 'Skipped invalid pattern(s):\n %s' % pats)
^- which explicitly displays the bad patterns, but then you do so in a warning and not an error. Which seems to disagree with your "now raises exceptions" comment.

3) It seems very strange to me that 'show_pattern_errors' is a module-level function that re-reads the ignore files and reports on them.

Rather than, for example, having the state of the Globster used to determine what patterns are invalid.

It means that anyone using Globster for anything that isn't *bzr* (imagine qbzr using it for something) is going to get strange results. (As it will tell them about problems with something that isn't related at all.)

4) so for the construct here:
105 + except errors.InvalidPattern, e:
106 + # lazy_regex throws InvalidPattern in case of bad pattern.
107 + # For Globbing, the bad pattern may have come from an ignore
108 + # file, so we should any bad patterns that the ignore file(s)
109 + # may have.
110 + show_pattern_errors()
111 + raise errors.InvalidPattern(str(e))

It seems better to me to do the formatting and put that on the errors.InvalidPattern that is getting raised, rather than calling show_pattern_errors() and having that display.

I also find it strange to catch InvalidPattern and then raise another InvalidPattern(str(first_e))

Especially given your formatting for InvalidPattern this would become:
    InvalidPattern('Invalid pattern(s) found.')

Which would lose any sort of context that the original error tried to pass on. (So if a low level exception raised InvalidPattern('pattern XXX is broken') then you catch it and re-raise a InvalidPattern('Invalid pattern(s) found.') which is pretty unhelpful.)

5) In this test:
351 + def test_add_invalid_ignore_in_global_ignore(self):
352 + """add should handle invalid global ignore pattern gracefully.
353 + """
354 + tree = self.make_branch_and_tree('.')
355 + self.build_tree(['top.txt'])
356 + self.build_tree_contents([('foobar', 'foo\n')])
357 + ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
358 + out, err = self.run_bzr(['add'], 3)
359 + self.assertEquals(out, '')
360 + self.assertContainsRe(err,
361 + ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
362 + 'Invalid pattern'),
363 + re.DOTALL)
364 +

^- You make it so that 'ignores.add_unique_user_ignores()' allows an invalid pattern. It seems more logical to fault that at insert time, rather than at 'bzr add' time. (We probably still need to fail there, but we should first forbid people from adding broken entries in the first place.)

388 + def test_invalid_pattern_in_ignore_file(self):
389 + """Existing inval...

Read more...

review: Needs Fixing
Revision history for this message
Parth Malwankar (parthm) wrote :
Download full text (5.7 KiB)

> 1) It might be clearer to resubmit this, since it has changed fairly
> significantly since the beginning. (So I don't think Robert's Needs Fixing
> still directly applies.)
>
> 2)
> 61 +class InvalidPattern(BzrError):
> 62 +
> 63 + _fmt = ('Invalid pattern(s) found.')
> 64 +
> 65 + def __init__(self, message):
> 66 + self.message = message
> 67 +
>
>
> ^- You set a message, but then never display it (there is nothing in _fmt
> which shows the attributes.)
>

Good catch. Will fix it and resubmit.

> I suppose in builtins you do:
> 29 + if invalid_patterns:
> 30 + pats = '\n '.join(invalid_patterns)
> 31 + ui.ui_factory.show_warning(
> 32 + 'Skipped invalid pattern(s):\n %s' % pats)
> ^- which explicitly displays the bad patterns, but then you do so in a warning
> and not an error. Which seems to disagree with your "now raises exceptions"
> comment.
>

This is for the `bzr ignore` command where if the user gives 'bzr good1 good2 bad1 good3 bad2', good1-3 get added to .bzrignore and a warning is issued for bad1-2.

> 3) It seems very strange to me that 'show_pattern_errors' is a module-level
> function that re-reads the ignore files and reports on them.
>
> Rather than, for example, having the state of the Globster used to determine
> what patterns are invalid.
>
> It means that anyone using Globster for anything that isn't *bzr* (imagine
> qbzr using it for something) is going to get strange results. (As it will tell
> them about problems with something that isn't related at all.)
>

One reason for doing this was that the invalid pattern could come either from .bazaar/ignore of .bzrignore and I wanted to display a message indicating the bad pattern along with the specific file information. E.g.

[a1234]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/a1234/.bzrignore":
  RE:*.x
bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
  RE:[
bzr: ERROR: Invalid pattern(s) found.
[a1234]%

We could use the Globster state to display the message but the user would need to check both files. I would expect that the probability of an error in .bzrignore is higher but it just seems nicer to indicate the file.
Thoughts?

> 4) so for the construct here:
> 105 + except errors.InvalidPattern, e:
> 106 + # lazy_regex throws InvalidPattern in case of bad pattern.
> 107 + # For Globbing, the bad pattern may have come from an ignore
> 108 + # file, so we should any bad patterns that the ignore file(s)
> 109 + # may have.
> 110 + show_pattern_errors()
> 111 + raise errors.InvalidPattern(str(e))
>
> It seems better to me to do the formatting and put that on the
> errors.InvalidPattern that is getting raised, rather than calling
> show_pattern_errors() and having that display.
>
> I also find it strange to catch InvalidPattern and then raise another
> InvalidPattern(str(first_e))
>
> Especially given your formatting for InvalidPattern this would become:
> InvalidPattern('Invalid pattern(s) found.')
>
> Which would lose any sort of context that the original error...

Read more...

5290. By Parth Malwankar

merged changes from trunk

5291. By Parth Malwankar

removed re_compile_checked as re.compile shows InvalidPattern error.
Fixed test cases.

Revision history for this message
John A Meinel (jameinel) wrote :
Download full text (4.4 KiB)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

...

>> It means that anyone using Globster for anything that isn't *bzr* (imagine
>> qbzr using it for something) is going to get strange results. (As it will tell
>> them about problems with something that isn't related at all.)
>>
>
> One reason for doing this was that the invalid pattern could come either from .bazaar/ignore of .bzrignore and I wanted to display a message indicating the bad pattern along with the specific file information. E.g.

Could we just include this information as part of the Globster from the
beginning? I understand that knowing where the pattern is bad is
helpful. I'm just concerned that it is using state after-the-fact to
debug a current failure. And anything that isn't *bzr* is likely to be
wrong here.

>
> [a1234]% ~/src/bzr.dev/300062-better-handling-for-invalid-ignore-pattern/bzr st
> bzr: error: Invalid pattern(s) found in "/home/parthm/tmp/a1234/.bzrignore":
> RE:*.x
> bzr: error: Invalid pattern(s) found in "/home/parthm/.bazaar/ignore":
> RE:[
> bzr: ERROR: Invalid pattern(s) found.
> [a1234]%
>
> We could use the Globster state to display the message but the user would need to check both files. I would expect that the probability of an error in .bzrignore is higher but it just seems nicer to indicate the file.
> Thoughts?

Pass in the name of the file associated with each pattern as part of
creating the Globster. Alternatively just give the short 'invalid
pattern found'. Better (IMO) than re-parsing the files.

...

>>
>> ^- You make it so that 'ignores.add_unique_user_ignores()' allows an invalid
>> pattern. It seems more logical to fault that at insert time, rather than at
>> 'bzr add' time. (We probably still need to fail there, but we should first
>> forbid people from adding broken entries in the first place.)
>>
>
> This is done in cmd_ignore.run. This was the "Skipped invalid pattern(s)" warning. So the bad pattern doesn't reach this point. This test case injects the bad pattern to handle the case of when the user might add the bad pattern using and external editor.
>
>

It seems better to me to make our internals robust and hard to subvert
accidentally, rather than assuming that will be done at the cmd_* layer.
Certainly qbzr or explorer is likely to end up with their own UI for
adding ignores, and it would be nice if they could also have invalid
ignores captured early rather than late. (Ideally without having to
re-implement it.)

>
>
>> ^- I would probably say that we should fail all insertions if one is invalid,
>> but we would want it to be atomic. eg if the do 'bzr ignore valid valid
>> invalid valid' either all valids get inserted [as you have today] or nothing
>> gets inserted. You don't want to add the first two, and then fail, as it
>> leaves things in an unknown state.
>>
>> I would probably tend towards 'validate early and fail', as long as we are
>> choosing that for other commands. (I would still like to see status try even
>> if it gets a bogus ignore, but I understand why Robert feels so strongly about
>> 'add' even if I probably disagree.)
>>
>> Also, it is a little bit odd that if you *have* an invalid entry it fails wi...

Read more...

Unmerged revisions

5291. By Parth Malwankar

removed re_compile_checked as re.compile shows InvalidPattern error.
Fixed test cases.

5290. By Parth Malwankar

merged changes from trunk

5289. By Parth Malwankar

better invalid pattern error handling.
Invalid patterns are now handled in a single location,
bzrlib.globbing.Globster.

5288. By Parth Malwankar

merged in changes from trunk

5287. By Parth Malwankar

fixed scope error due to lazy ui import

5286. By Parth Malwankar

filter_invalid_patterns now returns invalid and valid patterns.
dispay is left to the caller.

5285. By Parth Malwankar

added unit test for is_pattern_valid and filter_invalid_patterns

5284. By Parth Malwankar

unit tests for get_pattern_errors

5283. By Parth Malwankar

improved comments

5282. By Parth Malwankar

updated NEWS.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2010-06-28 21:45:10 +0000
+++ NEWS 2010-06-29 04:04:25 +0000
@@ -84,6 +84,14 @@
84 test that all commands available to the test suite have help.84 test that all commands available to the test suite have help.
85 (Robert Collins, #177500)85 (Robert Collins, #177500)
8686
87* Improved handling of invalid ignore patterns. ``ignore`` rejects invalid
88 patterns with a warning and does not add them to branch ignore file.
89 In case of an invalid pattern in ignore files, bzr commands accessing
90 the ignores fail with a clear error message indicating the invalid pattern
91 and file. Invalid regular expression will now display a bzr error
92 ``InvalidPattern`` instead of a stack trace.
93 (Parth Malwankar, #300062)
94
87* Progress output is cleaned up when exiting. (Aaron Bentley)95* Progress output is cleaned up when exiting. (Aaron Bentley)
8896
89* Raise ValueError instead of a string exception.97* Raise ValueError instead of a string exception.
9098
=== modified file 'bzrlib/builtins.py'
--- bzrlib/builtins.py 2010-06-17 08:53:15 +0000
+++ bzrlib/builtins.py 2010-06-29 04:04:25 +0000
@@ -2717,18 +2717,27 @@
2717 (len(name_pattern) > 1 and name_pattern[1] == ':')):2717 (len(name_pattern) > 1 and name_pattern[1] == ':')):
2718 raise errors.BzrCommandError(2718 raise errors.BzrCommandError(
2719 "NAME_PATTERN should not be an absolute path")2719 "NAME_PATTERN should not be an absolute path")
2720 name_pattern_list, invalid_patterns = \
2721 globbing.filter_invalid_patterns(name_pattern_list)
2722 if invalid_patterns:
2723 pats = '\n '.join(invalid_patterns)
2724 ui.ui_factory.show_warning(
2725 'Skipped invalid pattern(s):\n %s' % pats)
2726 if not name_pattern_list:
2727 # After removing invalid patterns, the list may be
2728 # empty. There is nothing to do in this case.
2729 return
2720 tree, relpath = WorkingTree.open_containing(directory)2730 tree, relpath = WorkingTree.open_containing(directory)
2721 ignores.tree_ignores_add_patterns(tree, name_pattern_list)2731 ignores.tree_ignores_add_patterns(tree, name_pattern_list)
2722 ignored = globbing.Globster(name_pattern_list)2732 ignored = globbing.Globster(name_pattern_list)
2723 matches = []2733 matches = []
2724 tree.lock_read()2734 self.add_cleanup(tree.lock_read().unlock)
2725 for entry in tree.list_files():2735 for entry in tree.list_files():
2726 id = entry[3]2736 id = entry[3]
2727 if id is not None:2737 if id is not None:
2728 filename = entry[0]2738 filename = entry[0]
2729 if ignored.match(filename):2739 if ignored.match(filename):
2730 matches.append(filename)2740 matches.append(filename)
2731 tree.unlock()
2732 if len(matches) > 0:2741 if len(matches) > 0:
2733 self.outf.write("Warning: the following files are version controlled and"2742 self.outf.write("Warning: the following files are version controlled and"
2734 " match your ignore pattern:\n%s"2743 " match your ignore pattern:\n%s"
27352744
=== modified file 'bzrlib/errors.py'
--- bzrlib/errors.py 2010-06-17 11:52:13 +0000
+++ bzrlib/errors.py 2010-06-29 04:04:25 +0000
@@ -3156,3 +3156,10 @@
3156 'E.g. bzr whoami "Your Name <name@example.com>"')3156 'E.g. bzr whoami "Your Name <name@example.com>"')
31573157
31583158
3159class InvalidPattern(BzrError):
3160
3161 _fmt = ('Invalid pattern(s) found. %(message)s')
3162
3163 def __init__(self, message):
3164 self.message = message
3165
31593166
=== modified file 'bzrlib/globbing.py'
--- bzrlib/globbing.py 2010-02-17 17:11:16 +0000
+++ bzrlib/globbing.py 2010-06-29 04:04:25 +0000
@@ -21,10 +21,19 @@
21"""21"""
2222
23import re23import re
24import errno
2425
26import bzrlib
27from bzrlib.lazy_regex import real_re_compile
25from bzrlib.trace import (28from bzrlib.trace import (
26 warning29 warning
27 )30 )
31from bzrlib import (
32 config,
33 bzrdir,
34 errors,
35 ui,
36 )
2837
2938
30class Replacer(object):39class Replacer(object):
@@ -209,10 +218,24 @@
209218
210 :return A matching pattern or None if there is no matching pattern.219 :return A matching pattern or None if there is no matching pattern.
211 """220 """
212 for regex, patterns in self._regex_patterns:221 try:
213 match = regex.match(filename)222 for regex, patterns in self._regex_patterns:
214 if match:223 match = regex.match(filename)
215 return patterns[match.lastindex -1]224 if match:
225 return patterns[match.lastindex -1]
226 except errors.InvalidPattern, e:
227 # lazy_regex throws InvalidPattern in case of bad pattern.
228 # For Globbing, the bad pattern may have come from an ignore
229 # file, so we should display any bad patterns that the ignore
230 # file(s) may have.
231 # The invalid pattern message (e.message) is specific to a
232 # pattern that re.compile failed to compile. As we show all
233 # possible pattern errors in .bazaar/ignore and .bzrignore
234 # we set the message to '' and causing InvalidPattern to show
235 # a generic error message.
236 show_pattern_errors()
237 e.message = ''
238 raise e
216 return None239 return None
217240
218class ExceptionGlobster(object):241class ExceptionGlobster(object):
@@ -283,3 +306,110 @@
283 if len(pattern) > 1:306 if len(pattern) > 1:
284 pattern = pattern.rstrip('/')307 pattern = pattern.rstrip('/')
285 return pattern308 return pattern
309
310
311def is_pattern_valid(pattern):
312 """Returns True if pattern is valid.
313
314 :param pattern: Normalized pattern.
315 is_pattern_valid() assumes pattern to be normalized.
316 see: normalize_pattern
317 """
318 result = True
319 translator = None
320 if pattern.startswith(u'RE:') or u'/' in pattern:
321 translator = _sub_fullpath
322 elif pattern.startswith(u'*.'):
323 translator = _sub_extension
324 else:
325 translator = _sub_basename
326 tpattern = '(%s)' % translator(pattern)
327 try:
328 cpattern = real_re_compile(tpattern)
329 except re.error, e:
330 result = False
331 return result
332
333
334def filter_invalid_patterns(patterns):
335 """Returns valid and invalid patterns.
336
337 :param patterns: List of normalized patterns.
338 filter_invalid_patterns() assumes pattern to be normalized.
339 see: normalize_pattern
340 """
341 valid_patterns = []
342 invalid_patterns = []
343 for pat in patterns:
344 if is_pattern_valid(pat):
345 valid_patterns.append(pat)
346 else:
347 invalid_patterns.append(pat)
348 return valid_patterns, invalid_patterns
349
350
351def get_pattern_errors():
352 """Return two tuples with information about invalid patterns.
353
354 :return: This returns two tuples in the format (path, patterns).
355 path is the full path to either the ignores file or .bzrignore.
356 patterns is a list of invalid patterns found in that file.
357
358 see also: show_pattern_errors()
359 """
360 # import ignores locally to avoid circular dependency
361 # with bzrlib.ignores. We need ignores.parse_ignore_file
362 # while bzrlib.ignores already uses bzrlib.globbing.
363 from bzrlib import ignores
364 def _get_errors(path):
365 invalid_patterns = []
366 f = None
367 try:
368 try:
369 f = open(path, 'rb')
370 patterns = ignores.parse_ignore_file(f)
371 for p in patterns:
372 if not is_pattern_valid(p):
373 invalid_patterns.append(p)
374 except (IOError, OSError), e:
375 # open() shouldn't return an IOError without errno,
376 # but just in case
377 err = getattr(e, 'errno', None)
378 if err != errno.ENOENT:
379 # Its not necessary that the user with have
380 # .bzrignore or .bazaar/ignore. If we are in
381 # _get_errors she definitely has at least one,
382 # but not necessarily both. So we ignore
383 # errno.ENOENT
384 raise e
385 finally:
386 if f:
387 f.close()
388 return invalid_patterns
389
390 wt, branch, relpath = \
391 bzrdir.BzrDir.open_containing_tree_or_branch('.')
392 bzrignore = wt.abspath(bzrlib.IGNORE_FILENAME)
393 if wt:
394 bzr_ignore_patterns = _get_errors(bzrignore)
395 else:
396 bzrignore = []
397 bzr_ignore_patterns = []
398 user_ignore = config.user_ignore_config_filename()
399 user_ignore_patterns = _get_errors(user_ignore)
400 return (bzrignore, bzr_ignore_patterns), (user_ignore, user_ignore_patterns)
401
402
403def show_pattern_errors():
404 """Displays a warning about invalid patterns found.
405
406 Display warning(s) with the list of invalid patterns
407 found in ignores file or .bzrignore.
408 """
409 invalid_pattern_info = get_pattern_errors()
410 for path, invalid_patterns in invalid_pattern_info:
411 if invalid_patterns:
412 pattern_str = '\n '.join(invalid_patterns)
413 ui.ui_factory.show_error('Invalid pattern(s) found in "%s":\n %s'
414 % (path, pattern_str))
415
286416
=== modified file 'bzrlib/ignores.py'
--- bzrlib/ignores.py 2010-05-03 04:51:58 +0000
+++ bzrlib/ignores.py 2010-06-29 04:04:25 +0000
@@ -44,10 +44,9 @@
44]44]
4545
4646
47
48def parse_ignore_file(f):47def parse_ignore_file(f):
49 """Read in all of the lines in the file and turn it into an ignore list48 """Read in all of the lines in the file and turn it into an ignore list
50 49
51 Continue in the case of utf8 decoding errors, and emit a warning when 50 Continue in the case of utf8 decoding errors, and emit a warning when
52 such and error is found. Optimise for the common case -- no decoding 51 such and error is found. Optimise for the common case -- no decoding
53 errors.52 errors.
@@ -176,7 +175,6 @@
176 """Get the current set of runtime ignores."""175 """Get the current set of runtime ignores."""
177 return _runtime_ignores176 return _runtime_ignores
178177
179
180def tree_ignores_add_patterns(tree, name_pattern_list):178def tree_ignores_add_patterns(tree, name_pattern_list):
181 """Add more ignore patterns to the ignore file in a tree.179 """Add more ignore patterns to the ignore file in a tree.
182 If ignore file does not exist then it will be created.180 If ignore file does not exist then it will be created.
@@ -227,3 +225,4 @@
227225
228 if not tree.path2id(bzrlib.IGNORE_FILENAME):226 if not tree.path2id(bzrlib.IGNORE_FILENAME):
229 tree.add([bzrlib.IGNORE_FILENAME])227 tree.add([bzrlib.IGNORE_FILENAME])
228
230229
=== modified file 'bzrlib/lazy_regex.py'
--- bzrlib/lazy_regex.py 2009-03-23 14:59:43 +0000
+++ bzrlib/lazy_regex.py 2010-06-29 04:04:25 +0000
@@ -22,6 +22,8 @@
2222
23import re23import re
2424
25from bzrlib import errors
26
2527
26class LazyRegex(object):28class LazyRegex(object):
27 """A proxy around a real regex, which won't be compiled until accessed."""29 """A proxy around a real regex, which won't be compiled until accessed."""
@@ -58,7 +60,12 @@
5860
59 def _real_re_compile(self, *args, **kwargs):61 def _real_re_compile(self, *args, **kwargs):
60 """Thunk over to the original re.compile"""62 """Thunk over to the original re.compile"""
61 return _real_re_compile(*args, **kwargs)63 try:
64 return real_re_compile(*args, **kwargs)
65 except re.error, e:
66 # raise InvalidPattern instead of re.error as this gives a
67 # cleaner message to the user.
68 raise errors.InvalidPattern('"' + args[0] + '" ' +str(e))
6269
63 def __getattr__(self, attr):70 def __getattr__(self, attr):
64 """Return a member from the proxied regex object.71 """Return a member from the proxied regex object.
@@ -97,11 +104,11 @@
97 Though the first call will reset back to the original (it doesn't104 Though the first call will reset back to the original (it doesn't
98 track nesting level)105 track nesting level)
99 """106 """
100 re.compile = _real_re_compile107 re.compile = real_re_compile
101108
102109
103_real_re_compile = re.compile110real_re_compile = re.compile
104if _real_re_compile is lazy_compile:111if real_re_compile is lazy_compile:
105 raise AssertionError(112 raise AssertionError(
106 "re.compile has already been overridden as lazy_compile, but this would" \113 "re.compile has already been overridden as lazy_compile, but this would" \
107 " cause infinite recursion")114 " cause infinite recursion")
108115
=== modified file 'bzrlib/log.py'
--- bzrlib/log.py 2010-06-17 08:53:15 +0000
+++ bzrlib/log.py 2010-06-29 04:04:25 +0000
@@ -86,7 +86,6 @@
86 format_date,86 format_date,
87 format_date_with_offset_in_original_timezone,87 format_date_with_offset_in_original_timezone,
88 get_terminal_encoding,88 get_terminal_encoding,
89 re_compile_checked,
90 terminal_width,89 terminal_width,
91 )90 )
92from bzrlib.symbol_versioning import (91from bzrlib.symbol_versioning import (
@@ -811,8 +810,7 @@
811 """810 """
812 if search is None:811 if search is None:
813 return log_rev_iterator812 return log_rev_iterator
814 searchRE = re_compile_checked(search, re.IGNORECASE,813 searchRE = re.compile(search, re.IGNORECASE)
815 'log message filter')
816 return _filter_message_re(searchRE, log_rev_iterator)814 return _filter_message_re(searchRE, log_rev_iterator)
817815
818816
819817
=== modified file 'bzrlib/osutils.py'
--- bzrlib/osutils.py 2010-06-25 20:34:05 +0000
+++ bzrlib/osutils.py 2010-06-29 04:04:25 +0000
@@ -2169,29 +2169,6 @@
2169 raise2169 raise
21702170
21712171
2172def re_compile_checked(re_string, flags=0, where=""):
2173 """Return a compiled re, or raise a sensible error.
2174
2175 This should only be used when compiling user-supplied REs.
2176
2177 :param re_string: Text form of regular expression.
2178 :param flags: eg re.IGNORECASE
2179 :param where: Message explaining to the user the context where
2180 it occurred, eg 'log search filter'.
2181 """
2182 # from https://bugs.launchpad.net/bzr/+bug/251352
2183 try:
2184 re_obj = re.compile(re_string, flags)
2185 re_obj.search("")
2186 return re_obj
2187 except re.error, e:
2188 if where:
2189 where = ' in ' + where
2190 # despite the name 'error' is a type
2191 raise errors.BzrCommandError('Invalid regular expression%s: %r: %s'
2192 % (where, re_string, e))
2193
2194
2195if sys.platform == "win32":2172if sys.platform == "win32":
2196 import msvcrt2173 import msvcrt
2197 def getchar():2174 def getchar():
21982175
=== modified file 'bzrlib/tests/__init__.py'
--- bzrlib/tests/__init__.py 2010-06-25 06:12:56 +0000
+++ bzrlib/tests/__init__.py 2010-06-29 04:04:25 +0000
@@ -2742,8 +2742,7 @@
2742 :param pattern: A regular expression string.2742 :param pattern: A regular expression string.
2743 :return: A callable that returns True if the re matches.2743 :return: A callable that returns True if the re matches.
2744 """2744 """
2745 filter_re = osutils.re_compile_checked(pattern, 0,2745 filter_re = re.compile(pattern, 0)
2746 'test filter')
2747 def condition(test):2746 def condition(test):
2748 test_id = test.id()2747 test_id = test.id()
2749 return filter_re.search(test_id)2748 return filter_re.search(test_id)
27502749
=== modified file 'bzrlib/tests/blackbox/test_add.py'
--- bzrlib/tests/blackbox/test_add.py 2010-02-23 07:43:11 +0000
+++ bzrlib/tests/blackbox/test_add.py 2010-06-29 04:04:25 +0000
@@ -18,8 +18,10 @@
18"""Tests of the 'bzr add' command."""18"""Tests of the 'bzr add' command."""
1919
20import os20import os
21import re
2122
22from bzrlib import (23from bzrlib import (
24 ignores,
23 osutils,25 osutils,
24 tests,26 tests,
25 )27 )
@@ -217,3 +219,31 @@
217 os.symlink(osutils.abspath('target'), 'tree/link')219 os.symlink(osutils.abspath('target'), 'tree/link')
218 out = self.run_bzr(['add', 'tree/link'])[0]220 out = self.run_bzr(['add', 'tree/link'])[0]
219 self.assertEquals(out, 'adding link\n')221 self.assertEquals(out, 'adding link\n')
222
223 def test_add_invalid_ignore(self):
224 """add should handle invalid ignore pattern gracefully.
225 """
226 tree = self.make_branch_and_tree('.')
227 self.build_tree(['top.txt'])
228 self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
229 out, err = self.run_bzr(['add'], 3)
230 self.assertEquals(out, '')
231 self.assertContainsRe(err,
232 ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
233 'Invalid pattern'),
234 re.DOTALL)
235
236 def test_add_invalid_ignore_in_global_ignore(self):
237 """add should handle invalid global ignore pattern gracefully.
238 """
239 tree = self.make_branch_and_tree('.')
240 self.build_tree(['top.txt'])
241 self.build_tree_contents([('foobar', 'foo\n')])
242 ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
243 out, err = self.run_bzr(['add'], 3)
244 self.assertEquals(out, '')
245 self.assertContainsRe(err,
246 ('error: Invalid pattern.*RE:\*\.cpp.*ERROR: '
247 'Invalid pattern'),
248 re.DOTALL)
249
220250
=== modified file 'bzrlib/tests/blackbox/test_ignore.py'
--- bzrlib/tests/blackbox/test_ignore.py 2010-06-11 07:32:12 +0000
+++ bzrlib/tests/blackbox/test_ignore.py 2010-06-29 04:04:25 +0000
@@ -162,3 +162,36 @@
162 tree = self.make_branch_and_tree('a')162 tree = self.make_branch_and_tree('a')
163 self.run_bzr(['ignore', '--directory=a', 'README'])163 self.run_bzr(['ignore', '--directory=a', 'README'])
164 self.check_file_contents('a/.bzrignore', 'README\n')164 self.check_file_contents('a/.bzrignore', 'README\n')
165
166 def test_ignored_invalid_pattern(self):
167 """Ensure graceful handling for invalid ignore pattern.
168
169 Test case for #300062.
170 Invalid pattern should show clear error message.
171 Invalid pattern should not be added to .bzrignore file.
172 """
173 tree = self.make_branch_and_tree('.')
174 out, err = self.run_bzr(['ignore', 'RE:*.cpp'])
175 self.assertEqual(out, '')
176 self.assertContainsRe(err,
177 'bzr: warning: Skipped invalid pattern.*RE:\*\.cpp', re.DOTALL)
178 self.assertFalse(os.path.isfile('.bzrignore'))
179
180 def test_invalid_pattern_in_ignore_file(self):
181 """Existing invalid patterns should be shown.
182 """
183 tree = self.make_branch_and_tree('.')
184 self.build_tree(['file'])
185 self.build_tree_contents([('.bzrignore', 'RE:[\n')])
186 out, err = self.run_bzr(['ignore', 'RE:*.cpp', 'abc'], 3)
187 self.assertEqual(out, '')
188 self.assertContainsRe(err, 'warning: Skipped invalid pattern'
189 '.*RE:\*\.cpp.*'
190 'error: Invalid pattern.*RE:\[', re.DOTALL)
191 f = open('.bzrignore', 'r')
192 try:
193 lines = f.readlines()
194 self.assertTrue('abc\n' in lines)
195 finally:
196 f.close()
197
165198
=== modified file 'bzrlib/tests/blackbox/test_ignored.py'
--- bzrlib/tests/blackbox/test_ignored.py 2010-06-11 07:32:12 +0000
+++ bzrlib/tests/blackbox/test_ignored.py 2010-06-29 04:04:25 +0000
@@ -17,6 +17,8 @@
1717
18"""Tests of the 'bzr ignored' command."""18"""Tests of the 'bzr ignored' command."""
1919
20import re
21
20from bzrlib.tests import TestCaseWithTransport22from bzrlib.tests import TestCaseWithTransport
2123
2224
@@ -46,3 +48,15 @@
46 ('a/.bzrignore', 'README')])48 ('a/.bzrignore', 'README')])
47 out, err = self.run_bzr(['ignored', '--directory=a'])49 out, err = self.run_bzr(['ignored', '--directory=a'])
48 self.assertStartsWith(out, 'README')50 self.assertStartsWith(out, 'README')
51
52 def test_invalid_pattern_in_ignore_file(self):
53 """Existing invalid patterns should be shown.
54 """
55 tree = self.make_branch_and_tree('.')
56 self.build_tree(['file'])
57 self.build_tree_contents([('.bzrignore', 'RE:[\n')])
58 out, err = self.run_bzr(['ignored'], 3)
59 self.assertEqual(out, '')
60 self.assertContainsRe(err, 'error: Invalid pattern.*RE:\[.*ERROR:',
61 re.DOTALL)
62
4963
=== modified file 'bzrlib/tests/blackbox/test_log.py'
--- bzrlib/tests/blackbox/test_log.py 2010-06-25 08:01:24 +0000
+++ bzrlib/tests/blackbox/test_log.py 2010-06-29 04:04:25 +0000
@@ -386,11 +386,8 @@
386 """386 """
387 self.make_minimal_branch()387 self.make_minimal_branch()
388 out, err = self.run_bzr(['log', '-m', '*'], retcode=3)388 out, err = self.run_bzr(['log', '-m', '*'], retcode=3)
389 self.assertEqual("bzr: ERROR: Invalid regular expression"389 self.assertContainsRe(err, "ERROR.*Invalid pattern.*nothing to repeat")
390 " in log message filter"390 self.assertEqual(out, '')
391 ": '*'"
392 ": nothing to repeat\n", err)
393 self.assertEqual('', out)
394391
395 def test_log_unsupported_timezone(self):392 def test_log_unsupported_timezone(self):
396 self.make_linear_branch()393 self.make_linear_branch()
397394
=== modified file 'bzrlib/tests/blackbox/test_status.py'
--- bzrlib/tests/blackbox/test_status.py 2010-04-07 21:34:13 +0000
+++ bzrlib/tests/blackbox/test_status.py 2010-06-29 04:04:25 +0000
@@ -24,12 +24,14 @@
2424
25from cStringIO import StringIO25from cStringIO import StringIO
26import codecs26import codecs
27import re
27from os import mkdir, chdir, rmdir, unlink28from os import mkdir, chdir, rmdir, unlink
28import sys29import sys
29from tempfile import TemporaryFile30from tempfile import TemporaryFile
3031
31from bzrlib import (32from bzrlib import (
32 bzrdir,33 bzrdir,
34 ignores,
33 conflicts,35 conflicts,
34 errors,36 errors,
35 osutils,37 osutils,
@@ -701,6 +703,57 @@
701 out, err = self.run_bzr('status tree/a')703 out, err = self.run_bzr('status tree/a')
702 self.assertNotContainsRe(out, 'pending merge')704 self.assertNotContainsRe(out, 'pending merge')
703705
706 def test_status_invalid_ignore_pattern(self):
707 """Tests status with invalid ignore pattern"""
708 wt = self.make_branch_and_tree('.')
709 self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
710 wt.commit('commit .bzrignore')
711 out, err = self.run_bzr(['status'], 3)
712 self.assertEquals(out, '')
713 self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
714 flags=re.DOTALL)
715
716 def test_status_invalid_ignore_pattern_in_global_ignore(self):
717 """Tests status with invalid ignore pattern in .bazaar/ignore"""
718 wt = self.make_branch_and_tree('.')
719 self.build_tree(['file'])
720 wt.add('file')
721 wt.commit('one')
722 # add unknown file
723 f = open('foobar', 'w')
724 try:
725 f.write(u"foo\n") # invalid pattern
726 finally:
727 f.close()
728 ignores.add_unique_user_ignores([u"RE:*.cpp"]) # invalid pattern
729 out, err = self.run_bzr(['status'], 3)
730 self.assertEquals(out, '')
731 self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
732 flags=re.DOTALL)
733
734 def test_short_status_invalid_ignore_pattern(self):
735 """Tests short status with invalid ignore pattern"""
736 wt = self.make_branch_and_tree('.')
737 self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
738 wt.commit('commit .bzrignore')
739 out, err = self.run_bzr(['status', '-S'], 3)
740 self.assertEquals(out, '')
741 self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
742 flags=re.DOTALL)
743
744 def test_specific_status_invalid_ignore_pattern(self):
745 """Tests specific status with invalid ignore pattern"""
746 wt = self.make_branch_and_tree('.')
747 self.build_tree(['file'])
748 wt.add('file')
749 wt.commit('added file')
750 self.build_tree_contents([('.bzrignore', 'RE:*.cpp\n')])
751 wt.commit('commit .bzrignore')
752 out, err = self.run_bzr(['status', 'file'], 3)
753 self.assertEquals(out, '')
754 self.assertContainsRe(err, 'error: Invalid pattern.*RE:\*\.cpp.*ERROR',
755 flags=re.DOTALL)
756
704757
705class TestStatusEncodings(TestCaseWithTransport):758class TestStatusEncodings(TestCaseWithTransport):
706759
707760
=== modified file 'bzrlib/tests/test_globbing.py'
--- bzrlib/tests/test_globbing.py 2010-02-17 17:11:16 +0000
+++ bzrlib/tests/test_globbing.py 2010-06-29 04:04:25 +0000
@@ -15,15 +15,20 @@
15# along with this program; if not, write to the Free Software15# along with this program; if not, write to the Free Software
16# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA16# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
1717
18from bzrlib import config, ignores
18from bzrlib.globbing import (19from bzrlib.globbing import (
19 Globster,20 Globster,
20 ExceptionGlobster,21 ExceptionGlobster,
22 filter_invalid_patterns,
23 get_pattern_errors,
24 is_pattern_valid,
21 _OrderedGlobster,25 _OrderedGlobster,
22 normalize_pattern26 normalize_pattern
23 )27 )
24from bzrlib.tests import (28from bzrlib.tests import (
25 TestCase,29 TestCase,
26 TestCaseInTempDir,30 TestCaseInTempDir,
31 TestCaseWithTransport,
27 )32 )
2833
2934
@@ -371,3 +376,48 @@
371 """tests that multiple mixed slashes are collapsed to single forward376 """tests that multiple mixed slashes are collapsed to single forward
372 slashes and trailing mixed slashes are removed"""377 slashes and trailing mixed slashes are removed"""
373 self.assertEqual(u'/foo/bar', normalize_pattern(u'\\/\\foo//\\///bar/\\\\/'))378 self.assertEqual(u'/foo/bar', normalize_pattern(u'\\/\\foo//\\///bar/\\\\/'))
379
380class TestInvalidPattern(TestCaseWithTransport):
381
382 def test_is_pattern_valid(self):
383 """Ensure is_pattern_valid() detects invalid patterns.
384 """
385 self.assertTrue(is_pattern_valid(normalize_pattern(u'foo')))
386 self.assertFalse(is_pattern_valid(normalize_pattern(u'RE:*.cpp')))
387
388 def test_filter_invalid_patterns(self):
389 """Ensure filter_invalid_patterns() returns only valid patterns.
390 """
391 patterns = [u'abc', u'RE:*.cpp', u'[', u'def']
392 patterns = [normalize_pattern(p) for p in patterns]
393 ref_valid_patterns = set([u'abc', u'def'])
394 ref_invalid_patterns = set([u'RE:*.cpp', u'['])
395 valid_patterns, invalid_patterns = filter_invalid_patterns(patterns)
396 self.assertEqual(ref_valid_patterns, set(valid_patterns))
397 self.assertEqual(ref_invalid_patterns, set(invalid_patterns))
398
399 def test_invalid_ignores(self):
400 """Ensure that invalid user ignores are detected.
401 """
402 ignore_path = config.user_ignore_config_filename()
403 user_ignores = ignores.get_user_ignores()
404 f = open(ignore_path, 'wb')
405 try:
406 f.write(u'foo\nRE:*.cpp\nbar\nRE:[\nbaz')
407 finally:
408 f.close()
409 (bi, bip), (ui, uip) = get_pattern_errors()
410 self.assertEquals(set(uip), set([u'RE:*.cpp', u'RE:[']))
411 self.assertEquals(bip, [])
412
413 def test_invalid_bzrignore(self):
414 """Ensure that invalid .bzrignore entries are detected.
415 """
416 tree = self.make_branch_and_tree(".")
417 self.build_tree_contents([('.bzrignore',
418 "foo\nRE:*.cpp\nbar\nRE:[\nbaz\n")])
419 tree.add([".bzrignore"])
420 (bi, bip), (ui, uip) = get_pattern_errors()
421 self.assertEquals(set(bip), set([u'RE:*.cpp', u'RE:[']))
422 self.assertEquals(uip, [])
423
374424
=== modified file 'bzrlib/tests/test_ignores.py'
--- bzrlib/tests/test_ignores.py 2010-05-03 04:51:58 +0000
+++ bzrlib/tests/test_ignores.py 2010-06-29 04:04:25 +0000
@@ -217,3 +217,4 @@
217 ignores.tree_ignores_add_patterns(tree, ["myentry2", "foo"])217 ignores.tree_ignores_add_patterns(tree, ["myentry2", "foo"])
218 self.assertEquals(open('.bzrignore', 'rb').read(), 'myentry1\r\nmyentry2\r\nfoo\r\n')218 self.assertEquals(open('.bzrignore', 'rb').read(), 'myentry1\r\nmyentry2\r\nfoo\r\n')
219 self.assertPatternsEquals(["myentry1", "myentry2", "foo"])219 self.assertPatternsEquals(["myentry1", "myentry2", "foo"])
220
220221
=== modified file 'bzrlib/tests/test_osutils.py'
--- bzrlib/tests/test_osutils.py 2010-06-08 01:45:09 +0000
+++ bzrlib/tests/test_osutils.py 2010-06-29 04:04:25 +0000
@@ -1703,24 +1703,6 @@
1703 self.assertRaises(IOError, osutils.resource_string, 'bzrlib', 'yyy.xx')1703 self.assertRaises(IOError, osutils.resource_string, 'bzrlib', 'yyy.xx')
17041704
17051705
1706class TestReCompile(tests.TestCase):
1707
1708 def test_re_compile_checked(self):
1709 r = osutils.re_compile_checked(r'A*', re.IGNORECASE)
1710 self.assertTrue(r.match('aaaa'))
1711 self.assertTrue(r.match('aAaA'))
1712
1713 def test_re_compile_checked_error(self):
1714 # like https://bugs.launchpad.net/bzr/+bug/251352
1715 err = self.assertRaises(
1716 errors.BzrCommandError,
1717 osutils.re_compile_checked, '*', re.IGNORECASE, 'test case')
1718 self.assertEqual(
1719 "Invalid regular expression in test case: '*': "
1720 "nothing to repeat",
1721 str(err))
1722
1723
1724class TestDirReader(tests.TestCaseInTempDir):1706class TestDirReader(tests.TestCaseInTempDir):
17251707
1726 # Set by load_tests1708 # Set by load_tests