Merge lp:~garyvdm/bzr-search/authors into lp:bzr-search

Proposed by Gary van der Merwe
Status: Merged
Approved by: Robert Collins
Approved revision: 75
Merged at revision: 79
Proposed branch: lp:~garyvdm/bzr-search/authors
Merge into: lp:bzr-search
Diff against target: 68 lines
3 files modified
NEWS (+3/-0)
index.py (+10/-3)
tests/test_index.py (+8/-1)
To merge this branch: bzr merge lp:~garyvdm/bzr-search/authors
Reviewer          Review Type    Date Requested    Status
Robert Collins                                     Pending
Review via email: mp+14066@code.launchpad.net

This proposal supersedes a proposal from 2009-10-27.

Gary van der Merwe (garyvdm) wrote : Posted in a previous version of this proposal

This patch enables bzr-search to index revision committers and authors.
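
As a rough illustration of what indexing committers and authors amounts to, the sketch below splits each "Name <email>" identity into name tokens plus the whole email address as one term. It is not the bzr-search code: `TOKENISER_RE` is a hypothetical stand-in for `_tokeniser_re` (whose pattern is not shown in this proposal), `identity_terms` is an invented helper name, and the standard library's `email.utils.parseaddr` is used in place of `bzrlib.config.parse_username`. The example identities are the ones used in this proposal's test.

```python
import re
from email.utils import parseaddr

# Hypothetical stand-in for bzr-search's _tokeniser_re; the real pattern
# is not shown in this proposal. Here we simply split on whitespace runs.
TOKENISER_RE = re.compile(r"\s+")


def identity_terms(identities):
    """Return the set of search terms for 'Name <email>' strings."""
    terms = set()
    for identity in identities:
        # parseaddr is a stdlib substitute for bzrlib.config.parse_username.
        name, email = parseaddr(identity)
        if email:
            # The email address is indexed whole, as a single term.
            terms.add(email)
        # The name is tokenised; skip empty strings re.split can produce.
        terms.update(token for token in TOKENISER_RE.split(name) if token)
    return terms


terms = identity_terms(["Joe Soap <joe@acme.com>", "Foo Baa <foo@example.com>"])
print(sorted(terms))
```

Each resulting term would then point back at the revision, so a search for a name token or for an email address can find it.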

Robert Collins (lifeless) wrote : Posted in a previous version of this proposal

 review: needsfixing

You've left a print statement in the code.

Here, getting rid of the temp variable makes the line long enough to require
wrapping, which is less clean. You can also initialise commit_terms to a set
directly rather than updating it with the initial set.

Other than that it looks ok.

- message_utf8 = revision.message.encode('utf8')
- commit_terms = _tokeniser_re.split(message_utf8)
+ commit_terms = set()
+ commit_terms.update(
+     set(_tokeniser_re.split(revision.message.encode('utf8'))))
+
+ for author in [revision.committer]+revision.get_apparent_authors():
+     name, email = bzrlib.config.parse_username(author)
+     commit_terms.add(email.encode('utf8'))
+     commit_terms.update(
+         set(_tokeniser_re.split(name.encode('utf8'))))
+

review: Needs Fixing
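
The style point in the review above can be sketched in isolation. This is only an illustration under stated assumptions: `TOKENISER_RE` is a hypothetical stand-in for `_tokeniser_re`, and the variable names are invented for the comparison. It shows that creating an empty set and updating it gives the same result as keeping the temporary variable and building the set directly, which is the shorter form the reviewer asks for.

```python
import re

# Hypothetical tokeniser; the real _tokeniser_re pattern is not shown here.
TOKENISER_RE = re.compile(r"\s+")

message = "first post"

# Form used in the quoted diff: start from an empty set, then update it.
terms_via_update = set()
terms_via_update.update(set(TOKENISER_RE.split(message)))

# Reviewer's suggestion: keep the temporary and build the set directly.
message_tokens = TOKENISER_RE.split(message)
terms_direct = set(message_tokens)

# Both forms yield the same terms; the second avoids a wrapped line.
print(terms_via_update == terms_direct)
```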

Preview Diff

=== modified file 'NEWS'
--- NEWS 2008-12-02 22:36:33 +0000
+++ NEWS 2009-10-28 04:45:20 +0000
@@ -35,6 +35,9 @@
  * Compatibility with split-inventory repositories (requires a bzrlib that
    supports them). (Robert Collins)

+ * Will now index revision committers and authors. (Bug 320236, Gary van
+   der Merwe)
+
  BUGFIXES:

  * Bug 293906 caused by changes in bzrlib has been fixed. This bug caused

=== modified file 'index.py'
--- index.py 2009-06-03 21:44:06 +0000
+++ index.py 2009-10-28 04:45:20 +0000
@@ -745,13 +745,20 @@
  # components of a revision:
  # parents - not indexed (but we could)
  # commit message (done)
- # author (todo)
- # committer (todo)
+ # author (done)
+ # committer (done)
  # properties (todo - names only?)
  # bugfixes (a property we know how to read)
  # other filters?
  message_utf8 = revision.message.encode('utf8')
- commit_terms = _tokeniser_re.split(message_utf8)
+ commit_terms = set(_tokeniser_re.split(message_utf8))
+
+ for author in [revision.committer]+revision.get_apparent_authors():
+     name, email = bzrlib.config.parse_username(author)
+     commit_terms.add(email.encode('utf8'))
+     name_utf8 = name.encode('utf8')
+     commit_terms.update(set(_tokeniser_re.split(name_utf8)))
+
  for term in commit_terms:
      if not term:
          continue

=== modified file 'tests/test_index.py'
--- tests/test_index.py 2009-03-10 23:10:42 +0000
+++ tests/test_index.py 2009-10-28 04:45:20 +0000
@@ -237,7 +237,8 @@
  )
  rev_index = index.init_index(tree.branch)
  # The double-space is a cheap smoke test for the tokeniser.
- revid = tree.commit('first post')
+ revid = tree.commit('first post', committer="Joe Soap <joe@acme.com>",
+     authors=["Foo Baa <foo@example.com>"])
  rev_index.index_revisions(tree.branch, [revid])
  self.assertEqual(set([(revid,)]), set(rev_index.indexed_revisions()))
  # reopen - it should retain the indexed revisions.
@@ -260,6 +261,12 @@
      ("working",): set([('f', 'an-id', revid)]),
      ("tree",): set([('f', 'an-id', revid)]),
      ('an-id', revid): set([('p', '', 'README.txt')]),
+     ('Baa',): set([('r', '', revid)]),
+     ('Foo',): set([('r', '', revid)]),
+     ('Joe',): set([('r', '', revid)]),
+     ('Soap',): set([('r', '', revid)]),
+     ('foo@example.com',): set([('r', '', revid)]),
+     ('joe@acme.com',): set([('r', '', revid)]),
  }
  all_terms = {}
  for term, posting_list in rev_index.all_terms():
