Merge lp:~jelmer/bzr/repo-size into lp:bzr

Proposed by Jelmer Vernooij
Status: Rejected
Rejected by: Jelmer Vernooij
Proposed branch: lp:~jelmer/bzr/repo-size
Merge into: lp:bzr
Diff against target: 136 lines (+55/-1)
7 files modified
NEWS (+3/-0)
bzrlib/remote.py (+10/-0)
bzrlib/repository.py (+6/-1)
bzrlib/smart/repository.py (+13/-0)
bzrlib/smart/request.py (+3/-0)
bzrlib/tests/per_repository/test_statistics.py (+10/-0)
bzrlib/tests/test_smart.py (+10/-0)
To merge this branch: bzr merge lp:~jelmer/bzr/repo-size
Reviewer Review Type Date Requested Status
Robert Collins (community) Needs Fixing
Review via email: mp+16582@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

This patch adds a custom implementation of Repository.__len__ that
returns the number of revisions in the repository.

This is mainly convenient for foreign branch plugins, which do not
provide Repository.revisions. Currently Repository.gather_stats() relies
on len(Repository.revisions), so foreign branch plugins have to override
it.

Cheers,

Jelmer

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, 2009-12-25 at 15:36 +0000, Jelmer Vernooij wrote:
> Jelmer Vernooij has proposed merging lp:~jelmer/bzr/repo-size into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #123653 "bzr info" depends on private Repository._revision_store
> https://bugs.launchpad.net/bugs/123653
>
>
> This patch adds a custom implementation of Repository.__len__ that
> returns the number of revisions in the repository.
>
> This is mainly convenient for foreign branch plugins, which do not
> provide Repository.revisions. Currently Repository.gather_stats() relies
> on len(Repository.revisions), so foreign branch plugins have to override
> it.

Generally we avoid giving objects that are not dicts or sequences a
__len__, and I think its awfully prone to confusion here too, so we
should not do that. I thought that this had been written down somewhere
- you may wish to briefly look for it in the dev docs and if not found
add a note to this effect: only make things look like containers if they
feel like them too. (that is that you won't be surprised as a user).

Rather, I suggest you put the revision count into the output of
gather_stats: its not guaranteed to be cheap for all repositories, so
there is every reason to make it optional.

 review needs-fixing

review: Needs Fixing
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

On Fri, 2009-12-25 at 19:21 +0000, Robert Collins wrote:
> Review: Needs Fixing
> On Fri, 2009-12-25 at 15:36 +0000, Jelmer Vernooij wrote:
> > Jelmer Vernooij has proposed merging lp:~jelmer/bzr/repo-size into lp:bzr.
> >
> > Requested reviews:
> > bzr-core (bzr-core)
> > Related bugs:
> > #123653 "bzr info" depends on private Repository._revision_store
> > https://bugs.launchpad.net/bugs/123653
> >
> >
> > This patch adds a custom implementation of Repository.__len__ that
> > returns the number of revisions in the repository.
> >
> > This is mainly convenient for foreign branch plugins, which do not
> > provide Repository.revisions. Currently Repository.gather_stats() relies
> > on len(Repository.revisions), so foreign branch plugins have to override
> > it.
>
> Generally we avoid giving objects that are not dicts or sequences a
> __len__, and I think its awfully prone to confusion here too, so we
> should not do that. I thought that this had been written down somewhere
> - you may wish to briefly look for it in the dev docs and if not found
> add a note to this effect: only make things look like containers if they
> feel like them too. (that is that you won't be surprised as a user).
>
> Rather, I suggest you put the revision count into the output of
> gather_stats: its not guaranteed to be cheap for all repositories, so
> there is every reason to make it optional.
As I mentioned in my merge request, it's already part of gather_stats()
but I'd like to have it in a separate function so I can override it for
subclasses. Would it be acceptable if I just renamed __len__ to
_get_revision_count() ?

Cheers,

Jelmer

Revision history for this message
Robert Collins (lifeless) wrote :

On Sat, 2009-12-26 at 00:51 +0000, Jelmer Vernooij wrote:

> > Generally we avoid giving objects that are not dicts or sequences a
> > __len__, and I think its awfully prone to confusion here too, so we
> > should not do that. I thought that this had been written down somewhere
> > - you may wish to briefly look for it in the dev docs and if not found
> > add a note to this effect: only make things look like containers if they
> > feel like them too. (that is that you won't be surprised as a user).
> >
> > Rather, I suggest you put the revision count into the output of
> > gather_stats: its not guaranteed to be cheap for all repositories, so
> > there is every reason to make it optional.
> As I mentioned in my merge request, it's already part of gather_stats()
> but I'd like to have it in a separate function so I can override it for
> subclasses. Would it be acceptable if I just renamed __len__ to
> _get_revision_count() ?

Or you could just delete printing the number of revisions. I don't think
a method on repository is a particularly good idea; if anything a method
on repository.revisions. However, gather_stats is *intended* to be
overridden, trying to avoid overriding it is odd to me :).

Another alternative would be to make the check for whether to attempt to
calculate the revision count more sophisticated.

-Rob

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Le samedi 26 décembre 2009 à 19:51 +0000, Robert Collins a écrit :
> On Sat, 2009-12-26 at 00:51 +0000, Jelmer Vernooij wrote:
> > > Generally we avoid giving objects that are not dicts or sequences a
> > > __len__, and I think its awfully prone to confusion here too, so we
> > > should not do that. I thought that this had been written down somewhere
> > > - you may wish to briefly look for it in the dev docs and if not found
> > > add a note to this effect: only make things look like containers if they
> > > feel like them too. (that is that you won't be surprised as a user).
> > >
> > > Rather, I suggest you put the revision count into the output of
> > > gather_stats: its not guaranteed to be cheap for all repositories, so
> > > there is every reason to make it optional.
> > As I mentioned in my merge request, it's already part of gather_stats()
> > but I'd like to have it in a separate function so I can override it for
> > subclasses. Would it be acceptable if I just renamed __len__ to
> > _get_revision_count() ?
> Or you could just delete printing the number of revisions. I don't think
> a method on repository is a particularly good idea; if anything a method
> on repository.revisions.
Except the foreign repository implementations don't necessarily
have .revisions set to something sane.

> However, gather_stats is *intended* to be
> overridden, trying to avoid overriding it is odd to me :).
>
> Another alternative would be to make the check for whether to attempt to
> calculate the revision count more sophisticated.
So, the reason I am trying to avoid overriding it is because it is quite
a big function and I would like to avoid duplicating all of that code in
all of the foreign branch plugins.

I think printing the number of revisions in "bzr info" is actually quite
useful, I've used it on several occasions.

Cheers,

Jelmer

Revision history for this message
Martin Pool (mbp) wrote :

2009/12/26 Robert Collins <email address hidden>:
> Generally we avoid giving objects that are not dicts or sequences a
> __len__, and I think its awfully prone to confusion here too, so we
> should not do that. I thought that this had been written down somewhere
> - you may wish to briefly look for it in the dev docs and if not found
> add a note to this effect: only make things look like containers if they
> feel like them too. (that is that you won't be surprised as a user).

+1

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Robert Collins (lifeless) wrote :

> > However, gather_stats is *intended* to be
> > overridden, trying to avoid overriding it is odd to me :).
> >
> > Another alternative would be to make the check for whether to attempt to
> > calculate the revision count more sophisticated.
> So, the reason I am trying to avoid overriding it is because it is quite
> a big function and I would like to avoid duplicating all of that code in
> all of the foreign branch plugins.
>
> I think printing the number of revisions in "bzr info" is actually quite
> useful, I've used it on several occasions.

So, its expensive for you because there isn't a size/len/area style function on .revisions? How about we add one there?

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Well, we don't have .revisions in the case of e.g. subversion, Repository.revisions is just set to None.

Revision history for this message
Robert Collins (lifeless) wrote :

I'm basically very hesitant to be adding more methods to Repository as a whole. Its overly big (still).

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> I'm basically very hesitant to be adding more methods to Repository as a
> whole. Its overly big (still).
Darn, that's a good point. I'll look at alternatives.

Revision history for this message
Vincent Ladeuil (vila) wrote :

>>>>> "Jelmer" == Jelmer Vernooij <email address hidden> writes:

    Jelmer> Well, we don't have .revisions in the case of e.g. subversion, Repository.revisions is just set to None.

Set it to [] instead ;-p

    Vincent

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2009-12-25 13:33:03 +0000
+++ NEWS 2009-12-25 15:36:19 +0000
@@ -63,6 +63,9 @@
63 CamelCase. For the features that were more likely to be used, we added a63 CamelCase. For the features that were more likely to be used, we added a
64 deprecation thunk, but not all. (John Arbash Meinel)64 deprecation thunk, but not all. (John Arbash Meinel)
6565
66* ``Repository.__len__`` will now return the number of revisions in
67 the repository. (#123653, Jelmer Vernooij)
68
66* ``WorkingTree.update`` implementations must now accept a ``revision``69* ``WorkingTree.update`` implementations must now accept a ``revision``
67 parameter.70 parameter.
6871
6972
=== modified file 'bzrlib/remote.py'
--- bzrlib/remote.py 2009-12-15 20:32:34 +0000
+++ bzrlib/remote.py 2009-12-25 15:36:19 +0000
@@ -886,6 +886,16 @@
886 parents_provider = self._make_parents_provider(other_repository)886 parents_provider = self._make_parents_provider(other_repository)
887 return graph.Graph(parents_provider)887 return graph.Graph(parents_provider)
888888
889 def __len__(self):
890 """See Repository.__len__."""
891 path = self.bzrdir._path_for_remote_call(self._client)
892 response_tuple, response_handler = self._call_expecting_body(
893 'Repository.__len__', path)
894 if response_tuple[0] != 'ok':
895 raise errors.UnexpectedSmartServerResponse(response_tuple)
896 body = response_handler.read_body_bytes()
897 return int(body)
898
889 def gather_stats(self, revid=None, committers=None):899 def gather_stats(self, revid=None, committers=None):
890 """See Repository.gather_stats()."""900 """See Repository.gather_stats()."""
891 path = self.bzrdir._path_for_remote_call(self._client)901 path = self.bzrdir._path_for_remote_call(self._client)
892902
=== modified file 'bzrlib/repository.py'
--- bzrlib/repository.py 2009-12-17 10:01:25 +0000
+++ bzrlib/repository.py 2009-12-25 15:36:19 +0000
@@ -926,6 +926,11 @@
926 r'.* revision="(?P<revision_id>[^"]+)"'926 r'.* revision="(?P<revision_id>[^"]+)"'
927 )927 )
928928
929 @needs_read_lock
930 def __len__(self):
931 """Determine the number of revisions in the repository."""
932 return len(self.revisions.keys())
933
929 def abort_write_group(self, suppress_errors=False):934 def abort_write_group(self, suppress_errors=False):
930 """Commit the contents accrued within the current write group.935 """Commit the contents accrued within the current write group.
931936
@@ -1471,7 +1476,7 @@
1471 # XXX: do we want to __define len__() ?1476 # XXX: do we want to __define len__() ?
1472 # Maybe the versionedfiles object should provide a different1477 # Maybe the versionedfiles object should provide a different
1473 # method to get the number of keys.1478 # method to get the number of keys.
1474 result['revisions'] = len(self.revisions.keys())1479 result['revisions'] = len(self)
1475 # result['size'] = t1480 # result['size'] = t
1476 return result1481 return result
14771482
14781483
=== modified file 'bzrlib/smart/repository.py'
--- bzrlib/smart/repository.py 2009-09-03 00:33:35 +0000
+++ bzrlib/smart/repository.py 2009-12-25 15:36:19 +0000
@@ -328,6 +328,19 @@
328 return SuccessfulSmartServerResponse(('no', ))328 return SuccessfulSmartServerResponse(('no', ))
329329
330330
331class SmartServerRepositorySize(SmartServerRepositoryRequest):
332
333 def do_repository_request(self, repository):
334 """Return the number of revisions in a repository.
335
336 :param repository: The repository to query in.
337
338 :return: A SmartServerResponse ('ok',), a encoded body containing
339 the number of revisions.
340 """
341 return SuccessfulSmartServerResponse(('ok', ), "%d" % len(repository))
342
343
331class SmartServerRepositoryGatherStats(SmartServerRepositoryRequest):344class SmartServerRepositoryGatherStats(SmartServerRepositoryRequest):
332345
333 def do_repository_request(self, repository, revid, committers):346 def do_repository_request(self, repository, revid, committers):
334347
=== modified file 'bzrlib/smart/request.py'
--- bzrlib/smart/request.py 2009-12-21 17:00:29 +0000
+++ bzrlib/smart/request.py 2009-12-25 15:36:19 +0000
@@ -592,6 +592,9 @@
592request_handlers.register_lazy('Repository.gather_stats',592request_handlers.register_lazy('Repository.gather_stats',
593 'bzrlib.smart.repository',593 'bzrlib.smart.repository',
594 'SmartServerRepositoryGatherStats')594 'SmartServerRepositoryGatherStats')
595request_handlers.register_lazy('Repository.__len__',
596 'bzrlib.smart.repository',
597 'SmartServerRepositorySize')
595request_handlers.register_lazy('Repository.get_parent_map',598request_handlers.register_lazy('Repository.get_parent_map',
596 'bzrlib.smart.repository',599 'bzrlib.smart.repository',
597 'SmartServerRepositoryGetParentMap')600 'SmartServerRepositoryGetParentMap')
598601
=== modified file 'bzrlib/tests/per_repository/test_statistics.py'
--- bzrlib/tests/per_repository/test_statistics.py 2009-03-23 14:59:43 +0000
+++ bzrlib/tests/per_repository/test_statistics.py 2009-12-25 15:36:19 +0000
@@ -62,3 +62,13 @@
62 'revisions': 062 'revisions': 0
63 },63 },
64 stats)64 stats)
65
66 def test_len(self):
67 tree = self.make_branch_and_memory_tree('.')
68 self.assertEquals(0, len(tree.branch.repository))
69 tree.lock_write()
70 tree.add('')
71 rev1 = tree.commit('first post', committer='person 1',
72 timestamp=1170491381, timezone=0)
73 tree.unlock()
74 self.assertEquals(1, len(tree.branch.repository))
6575
=== modified file 'bzrlib/tests/test_smart.py'
--- bzrlib/tests/test_smart.py 2009-10-29 00:16:50 +0000
+++ bzrlib/tests/test_smart.py 2009-12-25 15:36:19 +0000
@@ -1472,6 +1472,16 @@
1472 rev_id_utf8, 'yes'))1472 rev_id_utf8, 'yes'))
14731473
14741474
1475class TestSmartServerRepositorySize(tests.TestCaseWithMemoryTransport):
1476
1477 def test_empty(self):
1478 backing = self.get_transport()
1479 request = smart.repository.SmartServerRepositorySize(backing)
1480 repository = self.make_repository('.')
1481 self.assertEqual(SmartServerResponse(('ok', ), "0"),
1482 request.execute(''))
1483
1484
1475class TestSmartServerRepositoryIsShared(tests.TestCaseWithMemoryTransport):1485class TestSmartServerRepositoryIsShared(tests.TestCaseWithMemoryTransport):
14761486
1477 def test_is_shared(self):1487 def test_is_shared(self):