Bazaar

Merge lp:~vila/bzr/374726-gc-annotate into lp:~bzr/bzr/trunk-old

374726-gc-annotate
Merge into trunk-old

Proposed by Vincent Ladeuil on 2009-05-25

Status:

Merged

Merged at revision:

not available

Proposed branch:

lp:~vila/bzr/374726-gc-annotate

Merge into:

lp:~bzr/bzr/trunk-old

Diff against target:

53 lines

To merge this branch:

bzr merge lp:~vila/bzr/374726-gc-annotate

High

Fix Released

Reviewer	Review Type	Date Requested	Status
Ian Clatworthy		2009-05-25	Abstain on 2009-05-26
Review via email: mp+6785@code.launchpad.net

Revision history for this message

Vincent Ladeuil (vila) wrote on 2009-05-25:

This patch fixes the most blatant regression for any gc repository. Whether the repository is packed or not no longer matters.

There is still a performance regression compared to knit repositories, but it is far more limited (at most 2x) and is related to gc's different choices for deltas (which lead to different intermediate reannotate calls, although the same annotations are produced in the end).

Since we are not satisfied with annotate performance in either case, I'd like some feedback on whether it's worth spending time trying to catch up with knit here (investigating with jam showed no obvious way to achieve that, though) or whether to implement an annotation cache instead (which is outside the scope of this bug).

Revision history for this message

Ian Clatworthy (ian-clatworthy) wrote on 2009-05-26:

Just a quick thought: I'd prefer we fixed the comment (or code) than add another comment highlighting that they differ. Assuming you're at the sprint with John, poke him about what the correct comment ought to be. :-)

review: Abstain

Revision history for this message

John A Meinel (jameinel) wrote on 2009-05-26:

Vincent Ladeuil wrote:
> Vincent Ladeuil has proposed merging lp:~vila/bzr/374726-gc-annotate into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
> This patch fixes the most blatant regression for any gc repository. Whether the repository is packed or not no longer matters.
>
> There is still a performance regression compared to knit repositories, but it is far more limited (at most 2x) and is related to gc's different choices for deltas (which lead to different intermediate reannotate calls, although the same annotations are produced in the end).
>
> Since we are not satisfied with annotate performance in either case, I'd like some feedback on whether it's worth spending time trying to catch up with knit here (investigating with jam showed no obvious way to achieve that, though) or whether to implement an annotation cache instead (which is outside the scope of this bug).
>
>
>

Review approve

=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-04-20 08:37:32 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-05-25 19:04:59 +0000
@@ -89,6 +89,9 @@
             index_builder_class(reference_lists=1),
             # Texts: compression and per file graph, for all fileids - so two
             # reference lists and two elements in the key tuple.
+
+            # XXX: The comment says two reference_lists, yet the param set it
+            # to 1
             index_builder_class(reference_lists=1, key_elements=2),
             # Signatures: Just blobs to store, no compression, no parents
             # listing.

^- The original comment is just wrong. There is no "compression" parent
listed in groupcompress formats. So there is only the per-file graph,
and thus "reference_lists=1" is correct.

John
=:->
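
To illustrate the point about reference lists, here is a minimal, self-contained sketch. It is not bzrlib code: `ToyIndexBuilder` is a hypothetical stand-in that only mimics the `reference_lists` and `key_elements` parameters seen in the diff. It shows that a texts entry carrying only per-file graph parents fits an index declared with one reference list, while an entry that also carries a separate compression-parent list does not.

```python
class ToyIndexBuilder:
    """Hypothetical stand-in for an index builder; not the bzrlib class."""

    def __init__(self, reference_lists=0, key_elements=1):
        self.reference_lists = reference_lists
        self.key_elements = key_elements
        self.nodes = {}

    def add_node(self, key, value, references=()):
        # Every key is a tuple with a fixed number of elements.
        if len(key) != self.key_elements:
            raise ValueError("expected %d key elements" % self.key_elements)
        # Every node carries exactly `reference_lists` lists of keys.
        if len(references) != self.reference_lists:
            raise ValueError(
                "expected %d reference lists" % self.reference_lists)
        self.nodes[key] = (value, tuple(tuple(refs) for refs in references))


# Texts index as configured in the diff: two-element keys and a single
# reference list holding the per-file graph parents.
texts = ToyIndexBuilder(reference_lists=1, key_elements=2)
parent = ("file-id", "rev-1")
child = ("file-id", "rev-2")
texts.add_node(parent, "blob-1", ([],))
texts.add_node(child, "blob-2", ([parent],))

# An entry with an extra compression-parent list does not fit this index.
try:
    texts.add_node(("file-id", "rev-3"), "blob-3", ([child], [child]))
    fits = True
except ValueError:
    fits = False
```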

Revision history for this message

Vincent Ladeuil (vila) wrote on 2009-05-26:

>>>>> "Ian" == Ian Clatworthy <email address hidden> writes:

    Ian> Review: Abstain
    Ian> Just a quick thought: I'd prefer we fixed the comment (or code)
    Ian> than add another comment highlighting that they differ. Assuming
    Ian> you're at the sprint with John, poke him about what the correct
    Ian> comment ought to be. :-)

The intent was exactly that :) And, indeed, John reviewed it (mail is on
its painful way :)

Vincent

Preview Diff

=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py 2009-04-22 17:18:45 +0000
+++ bzrlib/groupcompress.py 2009-05-26 13:35:09 +0000
@@ -1018,15 +1018,19 @@
         else:
             keys = [key]
             parent_map = {key:()}
+        # So we used Graph(self) to load the parent_map, but now that we have
+        # it, we can just query the parent map directly, so create a new Graph
+        # object
+        graph = _mod_graph.Graph(_mod_graph.DictParentsProvider(parent_map))
         head_cache = _mod_graph.FrozenHeadsCache(graph)
         parent_cache = {}
         reannotate = annotate.reannotate
         for record in self.get_record_stream(keys, 'topological', True):
             key = record.key
-            chunks = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
+            lines = osutils.chunks_to_lines(record.get_bytes_as('chunked'))
             parent_lines = [parent_cache[parent] for parent in parent_map[key]]
             parent_cache[key] = list(
-                reannotate(parent_lines, chunks, key, None, head_cache))
+                reannotate(parent_lines, lines, key, None, head_cache))
         return parent_cache[key]

     def check(self, progress_bar=None):

=== modified file 'bzrlib/knit.py'
--- bzrlib/knit.py 2009-04-29 09:50:57 +0000
+++ bzrlib/knit.py 2009-05-26 13:35:09 +0000
@@ -3407,7 +3407,7 @@
             fulltext.)

         :return: A list of (key, index_memo) records, suitable for
-            passing to read_records_iter to start reading in the raw data fro/
+            passing to read_records_iter to start reading in the raw data from
             the pack file.
         """
         if key in self._annotated_lines:

=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py 2009-04-20 08:37:32 +0000
+++ bzrlib/repofmt/groupcompress_repo.py 2009-05-26 13:35:09 +0000
@@ -87,8 +87,8 @@
             # have a regular 2-list index giving parents and compression
             # source.
             index_builder_class(reference_lists=1),
-            # Texts: compression and per file graph, for all fileids - so two
-            # reference lists and two elements in the key tuple.
+            # Texts: per file graph, for all fileids - so one reference list
+            # and two elements in the key tuple.
             index_builder_class(reference_lists=1, key_elements=2),
             # Signatures: Just blobs to store, no compression, no parents
             # listing.
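
The groupcompress.py hunk loads the parent map once and then wraps that dict in a new graph, so that later heads queries are answered from memory. Below is a minimal sketch of that idea using toy stand-ins: `SlowStore`, `DictParents`, `ancestors`, and `heads` are hypothetical names for illustration and are not the bzrlib classes or their implementations.

```python
class SlowStore:
    """Hypothetical backing store that counts how often it is queried."""

    def __init__(self, parents):
        self._parents = parents
        self.lookups = 0

    def get_parent_map(self, keys):
        self.lookups += 1
        return {k: self._parents[k] for k in keys if k in self._parents}


class DictParents:
    """Answers parent queries from an in-memory dict."""

    def __init__(self, parent_map):
        self._parent_map = parent_map

    def get_parent_map(self, keys):
        return {k: self._parent_map[k] for k in keys if k in self._parent_map}


def ancestors(provider, key):
    """Return all strict ancestors of key, one provider query per step."""
    seen, pending = set(), [key]
    while pending:
        current = pending.pop()
        for parent in provider.get_parent_map([current]).get(current, ()):
            if parent not in seen:
                seen.add(parent)
                pending.append(parent)
    return seen


def heads(provider, keys):
    """Return the keys that are not ancestors of any other given key."""
    keys = set(keys)
    dominated = set()
    for key in keys:
        dominated |= ancestors(provider, key) & keys
    return keys - dominated


history = {"A": (), "B": ("A",), "C": ("A",), "D": ("B", "C")}
store = SlowStore(history)

# Load the whole parent map from the store once...
parent_map = store.get_parent_map(list(history))
# ...then answer every later graph query from the dict.
graph = DictParents(parent_map)
result = heads(graph, ["A", "B", "C"])
```

In this sketch `store.lookups` stays at 1 no matter how many `heads` calls follow, because every later query goes to the dict-backed provider instead of the store.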
