Merge lp:~parthm/bzr/503670-vila-grep-as-builtin into lp:bzr

Proposed by Parth Malwankar
Status: Rejected
Rejected by: Vincent Ladeuil
Proposed branch: lp:~parthm/bzr/503670-vila-grep-as-builtin
Merge into: lp:bzr
Diff against target: 216 lines (+183/-0)
3 files modified
bzrlib/builtins.py (+127/-0)
bzrlib/tests/blackbox/__init__.py (+1/-0)
bzrlib/tests/blackbox/test_grep.py (+55/-0)
To merge this branch: bzr merge lp:~parthm/bzr/503670-vila-grep-as-builtin
Reviewer Review Type Date Requested Status
Parth Malwankar Disapprove
Martin Packman (community) Needs Fixing
Robert Collins (community) Needs Fixing
Review via email: mp+19857@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Parth Malwankar (parthm) wrote :

Bug #503670

This patch ports the Vincent's grep plugin https://code.launchpad.net/~vila/bzr/grep
to builtins. The grep command is now available as 'bzr grep PATTERN [-- options]'. All grep options are support (followed by '--').
The current implementation requires xargs and grep commands to be available.

This command uses an implementation of os.path.relpath (ported from python 2.6 as cmd_grep._relpath). I tried using osutils.relpath but I found the two to be
inconsistent with the python 2.6 version in argument ordering, having optional arg and handling of '.'. As I don't fully understand the use cases of osutils.relpath I decided to keep cmd_grep._relpath.

Note on windows usage.
I tried this on a windows system with http://unxutils.sourceforge.net/ installed.
The command silently existed for me even though xargs and grep are available. It may be a good idea to disable this command on non-POSIX OSes. We could add a new error type. Comments/suggestions?

Example usage:
[bzrlib]% ../bzr --no-plugins grep main_version -- -i
__init__.py: main_version = '%d.%d' % version_info[:2]
__init__.py: main_version = '%d.%d.%d' % version_info[:3]
__init__.py: return main_version
__init__.py: return main_version + sub_string
tests/blackbox/test_version.py: def test_main_version(self):
[bzrlib]%

Revision history for this message
Robert Collins (lifeless) wrote :

 review: needsfixing

So this seems to have very few tests and not work on windows. I don't
think its good enough to merge to core yet.

Also as this doesn't look in history at all, there is really little
benefit in putting it in the core unless its going to make windows users
lives easier (which it won't). Right now its little more than 'bzr ls |
xargs greap'.

I rather suspect you'd be better of writing one from scratch with full
tests at appropriate unit and blackbox layers, to get windows support
and test coverage.

-Rob

review: Needs Fixing
Revision history for this message
Martin Packman (gz) wrote :

As a windows user, I have no problem with the idea of a `bzr grep` that depends on an external grep tool, I'd just want you to lose the xargs dependency.

The -- thing is ugly, why not just pull out the bazaar-specific arguments from the list given and pass the rest onto the external tool?

review: Needs Fixing
Revision history for this message
Parth Malwankar (parthm) wrote :

> As a windows user, I have no problem with the idea of a `bzr grep` that
> depends on an external grep tool, I'd just want you to lose the xargs
> dependency.
>

Having grep.exe shipped with bzr is one option. But for something like history
and view support, we would still need to do use bzrlib API to filter files,
get text of older versions and pipe it to grep. And as you mentioned below, I also
think '--' looks quite ugly so it might be good to process options via bzr's
command line processing.

I feel going with 'bzr ls [opts] | grep' may be a better option than the above. For history we would probably need to use 'bzr cat'?

Considering the above, IMO going with a plugin gives us more flexibility (e.g the --from-root option in the plugin) along with the possibility of easily enhancing it in the future.

> The -- thing is ugly, why not just pull out the bazaar-specific arguments from
> the list given and pass the rest onto the external tool?

I am wondering if there is a consensus on the approach to bug #503670.
The options which seem ok to me are:
 (1) 'bzr ls [opts] | grep': bzr could [based on user selection] install grep.exe on windows. 'bzr cat' could be used for history search.
 (2) self contained grep [a]: This should probably be a plugin (at least to start with).
 (3) ???
Whatever option is chosen should have support for views and history IMO.

Once there is some agreement on the approach thats preferred, we should be
able to close bug #503670

[a] https://code.launchpad.net/~parthm/bzr/503670-grep-plugin/+merge/20067

Revision history for this message
Martin Pool (mbp) wrote :

I think it's reasonable to have a selfcontained grep, so that
 * it can efficiently search history
 * it's easier for windows users

Building on Python's re module a basic grep should be very small.

I don't strongly care if this is in a plugin or core. I don't think
core needs to be minimal and I don't think it needs to be maximal and
merge every useful feature.

It would be nice to also have the option to pipe files through an
external grep so people can get extra features from grep.
--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Parth Malwankar (parthm) wrote :

> It would be nice to also have the option to pipe files through an
> external grep so people can get extra features from grep.
> --
> Martin <http://launchpad.net/~mbp/>

Sounds like a good idea. '--cat-only'.

Revision history for this message
Martin Pool (mbp) wrote :

On 26 February 2010 16:54, Parth Malwankar <email address hidden> wrote:
>
>
>> It would be nice to also have the option to pipe files through an
>> external grep so people can get extra features from grep.
>> --
>> Martin <http://launchpad.net/~mbp/>
>
> Sounds like a good idea. '--cat-only'.

Well, actually I meant --external-grep=PROG, and then that can be run
for each file, using eg gnu grep's --label argument to indicate which
version of which file it is.

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Parth Malwankar (parthm) wrote :

This proposal should probably be rejected over lp:bzr-grep.

review: Disapprove

Unmerged revisions

5054. By Parth Malwankar

ported tests

5053. By Parth Malwankar

move grep command into core (from plugin https://code.launchpad.net/~vila/bzr/grep)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'bzrlib/builtins.py'
--- bzrlib/builtins.py 2010-02-15 20:23:34 +0000
+++ bzrlib/builtins.py 2010-02-22 11:16:18 +0000
@@ -24,6 +24,7 @@
24import cStringIO24import cStringIO
25import sys25import sys
26import time26import time
27import subprocess
2728
28import bzrlib29import bzrlib
29from bzrlib import (30from bzrlib import (
@@ -5916,6 +5917,132 @@
5916 for path, location in sorted(ref_list):5917 for path, location in sorted(ref_list):
5917 self.outf.write('%s %s\n' % (path, location))5918 self.outf.write('%s %s\n' % (path, location))
59185919
5920class cmd_grep(Command):
5921 """Grep through your working tree, but only files managed by bzr.
5922
5923 By default this is a recursive grep, unless you specify --non-recursive.
5924
5925 To pass options to grep use "--" to signal the end of options to bzr, eg:
5926
5927 # bzr grep pattern -- -i file
5928
5929 This does a case-insensitive grep on file.
5930 """
5931
5932 takes_args = ['pattern', 'file*']
5933 takes_options = [
5934 Option(
5935 'non-recursive',
5936 help="Don't recursively grep the contents of directories."),
5937 ]
5938
5939 # TODO: Separate file selection from file processing
5940 # TODO: avoid xargs completely ?
5941 # TODO: separate file selection from file processing to get a true streamed
5942 # process (everything is is buffered so far, inputs as well as outputs and will
5943 # blow up for big trees one day or the other)
5944 @display_command
5945 def run(self, pattern, non_recursive=False, file_list=None):
5946
5947 curdir = osutils.getcwd()
5948
5949 options, file_list = self._read_options(file_list)
5950
5951 versioned = 'V'
5952
5953 file_list = tree_files(file_list)[1]
5954 tree, b, reldir = bzrdir.BzrDir.open_containing_tree_or_branch('.')
5955 if reldir:
5956 reldir += '/'
5957 tree.lock_read()
5958
5959 try:
5960 cmd = ['xargs', '-0', 'grep', '-e', pattern]
5961 cmd.extend(options)
5962
5963 xargs = subprocess.Popen(cmd, stdin=subprocess.PIPE,
5964 stdout=subprocess.PIPE,
5965 stderr=subprocess.PIPE,
5966 )
5967
5968 for fpath, fclass, fkind, fid, entry \
5969 in tree.list_files(include_root=False):
5970 # XXX: this selection part is messy, more tests are needed and
5971 # a better flow too
5972 if fkind == 'directory':
5973 continue
5974 if file_list:
5975 if non_recursive:
5976 if not fpath in file_list:
5977 continue
5978 elif not osutils.is_inside_any(file_list, fpath):
5979 continue
5980 elif reldir and not fpath.startswith(reldir):
5981 continue
5982 if (non_recursive and reldir
5983 and fpath.startswith(reldir)
5984 and '/' in fpath[len(reldir):]):
5985 continue
5986 if fclass != versioned:
5987 continue
5988 rpath = self._relpath(tree.abspath(fpath))
5989 xargs.stdin.write(rpath + '\0')
5990 xargs.stdin.flush()
5991 out, err = xargs.communicate()
5992 if out:
5993 sys.stdout.write(out)
5994 if err:
5995 sys.stderr.write(err)
5996 finally:
5997 tree.unlock()
5998
5999 def _read_options(self, list):
6000 # This is a bit of a hack, but if we scan the file list for
6001 # options we can do:
6002 # bzr grep foo -- -i file
6003 # and pass the -i through to grep.
6004
6005 if list is None:
6006 return [], []
6007
6008 files = []
6009 options = []
6010
6011 while len(list) > 0:
6012 item = list.pop(0)
6013
6014 if item.startswith('-'):
6015 options.append(item)
6016 if item in ['-A', '-B', '-C', '-m']:
6017 options.append(list.pop(0))
6018 else:
6019 files.append(item)
6020
6021 return options, files
6022
6023 # Borrowed from python2.6 until we get there
6024 def _relpath(self, path, start=os.path.curdir):
6025 """Return a relative version of a path.
6026
6027 This a slighty simplified and modified version of the 2.6
6028 version. Windows specific error handling deleted since we always calls
6029 this function for paths on the same volume.
6030 """
6031
6032 if not path:
6033 raise ValueError("no path specified")
6034
6035 start_list = os.path.abspath(start).split(os.path.sep)
6036 path_list = os.path.abspath(path).split(os.path.sep)
6037
6038 # Work out how much of the filepath is shared by start and path.
6039 i = len(os.path.commonprefix([start_list, path_list]))
6040
6041 rel_list = [os.path.pardir] * (len(start_list)-i) + path_list[i:]
6042 return os.path.join(*rel_list)
6043
6044
6045
59196046
5920# these get imported and then picked up by the scan for cmd_*6047# these get imported and then picked up by the scan for cmd_*
5921# TODO: Some more consistent way to split command definitions across files;6048# TODO: Some more consistent way to split command definitions across files;
59226049
=== modified file 'bzrlib/tests/blackbox/__init__.py'
--- bzrlib/tests/blackbox/__init__.py 2009-04-29 20:31:34 +0000
+++ bzrlib/tests/blackbox/__init__.py 2010-02-22 11:16:18 +0000
@@ -60,6 +60,7 @@
60 'bzrlib.tests.blackbox.test_filesystem_cicp',60 'bzrlib.tests.blackbox.test_filesystem_cicp',
61 'bzrlib.tests.blackbox.test_filtered_view_ops',61 'bzrlib.tests.blackbox.test_filtered_view_ops',
62 'bzrlib.tests.blackbox.test_find_merge_base',62 'bzrlib.tests.blackbox.test_find_merge_base',
63 'bzrlib.tests.blackbox.test_grep',
63 'bzrlib.tests.blackbox.test_help',64 'bzrlib.tests.blackbox.test_help',
64 'bzrlib.tests.blackbox.test_hooks',65 'bzrlib.tests.blackbox.test_hooks',
65 'bzrlib.tests.blackbox.test_ignore',66 'bzrlib.tests.blackbox.test_ignore',
6667
=== added file 'bzrlib/tests/blackbox/test_grep.py'
--- bzrlib/tests/blackbox/test_grep.py 1970-01-01 00:00:00 +0000
+++ bzrlib/tests/blackbox/test_grep.py 2010-02-22 11:16:18 +0000
@@ -0,0 +1,55 @@
1# Copyright (C) 2010 Canonical Ltd
2#
3# This program is free software; you can redistribute it and/or modify
4# it under the terms of the GNU General Public License as published by
5# the Free Software Foundation; either version 2 of the License, or
6# (at your option) any later version.
7#
8# This program is distributed in the hope that it will be useful,
9# but WITHOUT ANY WARRANTY; without even the implied warranty of
10# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
11# GNU General Public License for more details.
12#
13# You should have received a copy of the GNU General Public License
14# along with this program; if not, write to the Free Software
15# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
16
17import os
18
19from bzrlib.tests import TestCaseWithTransport
20
21class TestGrepInDirs(TestCaseWithTransport):
22
23 def setUp(self):
24 super(TestGrepInDirs, self).setUp()
25 tree = self.make_branch_and_tree('.')
26 file_list = ['top', 'subdir/', 'subdir/down']
27 self.build_tree(file_list)
28 tree.add(file_list)
29 tree.commit('setup')
30
31 def assertGrepResultIs(self, expected_output, grep_args,
32 working_dir='.',
33 expected_error_re='.*'):
34 out, err = self.run_bzr(['grep'] + grep_args, working_dir=working_dir)
35 self.assertEquals(expected_output, out)
36 self.assertContainsRe(err, expected_error_re)
37
38 def test_grep_in_root_dir(self):
39 self.assertGrepResultIs('subdir/down:contents of subdir/down\n'
40 'top:contents of top\n' ,
41 ['contents'])
42 self.assertGrepResultIs('contents of top\n', ['top', 'top'])
43 self.assertGrepResultIs('contents of subdir/down\n', ['down',
44 'subdir/down'])
45
46 def test_grep_in_subir(self):
47 self.assertGrepResultIs('contents of subdir/down\n', ['down', 'down'],
48 working_dir='subdir')
49 self.assertGrepResultIs('contents of top\n', ['top', '../top'],
50 working_dir='subdir')
51
52 def test_grep_in_subdir_doesnt_go_up(self):
53 self.assertGrepResultIs('contents of subdir/down\n', ['contents'],
54 working_dir='subdir')
55