Merge lp:~vila/bzr/374726-gc-annotate into lp:~bzr/bzr/trunk-old

Proposed by Vincent Ladeuil on 2009-05-25
Status: Merged
Merged at revision: not available
Proposed branch: lp:~vila/bzr/374726-gc-annotate
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 53 lines
To merge this branch: bzr merge lp:~vila/bzr/374726-gc-annotate
Reviewer: Ian Clatworthy
Date requested: 2009-05-25
Status: Abstain on 2009-05-26
Review via email: mp+6785@code.launchpad.net
Vincent Ladeuil (vila) wrote :

This patch fixes the most blatant regression for any gc repository. Whether the repository is packed or not no longer matters.

There is still a performance regression compared to knit repositories, but it is far more limited (at most 2x) and comes from gc making different choices for deltas (which leads to different intermediate reannotate calls, although the same annotations are produced in the end).

Since we are not satisfied with annotate performance in either case, I'd like some feedback on whether it's worth spending time trying to catch up with knit here (investigating with jam showed no obvious way to achieve that, though) or whether to go for implementing an annotation cache (which is outside this bug's scope).

Ian Clatworthy (ian-clatworthy) wrote :

Just a quick thought: I'd prefer we fixed the comment (or code) than add another comment highlighting that they differ. Assuming you're at the sprint with John, poke him about what the correct comment ought to be. :-)

review: Abstain
John A Meinel (jameinel) wrote :

Vincent Ladeuil wrote:
> Vincent Ladeuil has proposed merging lp:~vila/bzr/374726-gc-annotate into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
> This patch fixes the most blatant regression for any gc repository. Whether the repository is packed or not no longer matters.
>
> There is still a performance regression compared to knit repositories, but it is far more limited (at most 2x) and comes from gc making different choices for deltas (which leads to different intermediate reannotate calls, although the same annotations are produced in the end).
>
> Since we are not satisfied with annotate performance in either case, I'd like some feedback on whether it's worth spending time trying to catch up with knit here (investigating with jam showed no obvious way to achieve that, though) or whether to go for implementing an annotation cache (which is outside this bug's scope).
>
>
>

 Review approve

=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-04-20 08:37:32 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-05-25 19:04:59 +0000
@@ -89,6 +89,9 @@
             index_builder_class(reference_lists=1),
             # Texts: compression and per file graph, for all fileids - so two
             # reference lists and two elements in the key tuple.
+
+            # XXX: The comment says two reference_lists, yet the param set it
+            # to 1
             index_builder_class(reference_lists=1, key_elements=2),
             # Signatures: Just blobs to store, no compression, no parents
             # listing.

^- The original comment is just wrong. There is no "compression" parent
listed in groupcompress formats. So there is only the per-file graph,
and thus "reference_lists=1" is correct.

John
=:->
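
To make the distinction John describes concrete, here is a minimal sketch of how the number of reference lists constrains what each index node may store. `ToyIndexBuilder` is a hypothetical stand-in written for illustration, not bzrlib's index builder class; the file ids and revision ids are made up, and the two-list layout (parents plus compression parent) is assumed from the wording of the old comment.

```python
class ToyIndexBuilder(object):
    """Hypothetical stand-in for an index builder (not bzrlib's class)."""

    def __init__(self, reference_lists=0, key_elements=1):
        self.reference_lists = reference_lists
        self.key_elements = key_elements
        self.nodes = {}

    def add_node(self, key, value, references=()):
        # Every key must have the configured number of elements.
        if len(key) != self.key_elements:
            raise ValueError("key %r needs %d elements"
                             % (key, self.key_elements))
        # Every node must supply exactly the configured number of lists.
        if len(references) != self.reference_lists:
            raise ValueError("expected %d reference lists, got %d"
                             % (self.reference_lists, len(references)))
        self.nodes[key] = (value, tuple(tuple(r) for r in references))


# groupcompress-style texts index: a single list, the per-file graph parents.
gc_texts = ToyIndexBuilder(reference_lists=1, key_elements=2)
gc_texts.add_node(("file-id", "rev-2"), "blob",
                  ([("file-id", "rev-1")],))

# Layout described by the old comment (assumed): parents plus a separate
# compression-parent list, hence two reference lists.
two_list_texts = ToyIndexBuilder(reference_lists=2, key_elements=2)
two_list_texts.add_node(("file-id", "rev-2"), "blob",
                        ([("file-id", "rev-1")], [("file-id", "rev-1")]))
```

With `reference_lists=1`, a node that tried to record a compression parent as a second list would be rejected, which matches the point that only the per-file graph is stored for groupcompress texts.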

Vincent Ladeuil (vila) wrote :

>>>>> "Ian" == Ian Clatworthy <email address hidden> writes:

    Ian> Review: Abstain
    Ian> Just a quick thought: I'd prefer we fixed the comment (or code)
    Ian> than add another comment highlighting that they differ. Assuming
    Ian> you're at the sprint with John, poke him about what the correct
    Ian> comment ought to be. :-)

The intent was exactly that :) And, indeed, John reviewed it (mail is on
its painful way :)

    Vincent

lp:~vila/bzr/374726-gc-annotate updated on 2009-05-26
4373. By Vincent Ladeuil on 2009-05-26

Fix comment as per John's review.

Preview Diff

=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-04-22 17:18:45 +0000
+++ bzrlib/groupcompress.py 2009-05-26 13:35:09 +0000
@@ -1018,15 +1018,19 @@
         else:
             keys = [key]
             parent_map = {key:()}
+        # So we used Graph(self) to load the parent_map, but now that we have
+        # it, we can just query the parent map directly, so create a new Graph
+        # object
+        graph = _mod_graph.Graph(_mod_graph.DictParentsProvider(parent_map))
         head_cache = _mod_graph.FrozenHeadsCache(graph)
         parent_cache = {}
         reannotate = annotate.reannotate
         for record in self.get_record_stream(keys, 'topological', True):
             key = record.key
-            chunks = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
+            lines = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
             parent_lines = [parent_cache[parent] for parent in parent_map[key]]
             parent_cache[key] = list(
-                reannotate(parent_lines, chunks, key, None, head_cache))
+                reannotate(parent_lines, lines, key, None, head_cache))
         return parent_cache[key]
 
     def check(self, progress_bar=None):

=== modified file 'bzrlib/knit.py'
--- bzrlib/knit.py 2009-04-29 09:50:57 +0000
+++ bzrlib/knit.py 2009-05-26 13:35:09 +0000
@@ -3407,7 +3407,7 @@
         fulltext.)
 
         :return: A list of (key, index_memo) records, suitable for
-            passing to read_records_iter to start reading in the raw data fro/
+            passing to read_records_iter to start reading in the raw data from
             the pack file.
         """
         if key in self._annotated_lines:

=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-04-20 08:37:32 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-05-26 13:35:09 +0000
@@ -87,8 +87,8 @@
             # have a regular 2-list index giving parents and compression
             # source.
             index_builder_class(reference_lists=1),
-            # Texts: compression and per file graph, for all fileids - so two
-            # reference lists and two elements in the key tuple.
+            # Texts: per file graph, for all fileids - so one reference list
+            # and two elements in the key tuple.
             index_builder_class(reference_lists=1, key_elements=2),
             # Signatures: Just blobs to store, no compression, no parents
             # listing.
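
The idea behind the groupcompress.py change (load the parent map once, then answer graph queries from that in-memory dict) can be illustrated with a small sketch. Everything below is hypothetical and simplified: `CountingStore`, `DictProvider`, `ancestors` and `heads` are toy names written for this example, not bzrlib's `Graph`, `DictParentsProvider` or `FrozenHeadsCache`, and the naive `heads` here makes no claim to match bzrlib's algorithm.

```python
class CountingStore(object):
    """Hypothetical slow backend that counts the lookups it serves."""

    def __init__(self, parents):
        self._parents = parents
        self.calls = 0

    def get_parent_map(self, keys):
        self.calls += 1
        return dict((k, self._parents[k]) for k in keys
                    if k in self._parents)


class DictProvider(object):
    """Serves parent lookups from an already-loaded dict."""

    def __init__(self, parent_map):
        self._parent_map = parent_map

    def get_parent_map(self, keys):
        return dict((k, self._parent_map[k]) for k in keys
                    if k in self._parent_map)


def ancestors(provider, key):
    """Return key plus all of its ancestors known to the provider."""
    seen = set()
    pending = {key}
    while pending:
        found = provider.get_parent_map(pending)
        seen.update(found)
        pending = set(p for parents in found.values()
                      for p in parents) - seen
    return seen


def heads(provider, keys):
    """Return the keys that are not an ancestor of another given key."""
    result = set(keys)
    for key in set(keys):
        result -= ancestors(provider, key) - {key}
    return result


store = CountingStore({"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C")})
parent_map = store.get_parent_map(["A", "B", "C", "D"])  # one bulk load
calls_after_load = store.calls

in_memory = DictProvider(parent_map)
result = heads(in_memory, ["B", "D"])  # answered without touching store
```

In this toy setup the repeated heads queries that reannotate would trigger never reach the backend again, since every lookup is served by `in_memory`; `store.calls` stays at the value recorded after the bulk load.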