Merge lp:~lifeless/bzr/bug-402652 into lp:~bzr/bzr/trunk-old

Proposed by Robert Collins
Status: Rejected
Rejected by: John A Meinel
Proposed branch: lp:~lifeless/bzr/bug-402652
Merge into: lp:~bzr/bzr/trunk-old
Diff against target: 94 lines
To merge this branch: bzr merge lp:~lifeless/bzr/bug-402652
Reviewer: John A Meinel
Status: Needs Resubmitting
Review via email: mp+10960@code.launchpad.net
Robert Collins (lifeless) wrote:

[MERGE] Fix bug 402652 by recompressing all texts that are streamed -
slightly slower at fetch, substantially faster and more compact at read.
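
In outline, the branch requests the stream in 'groupcompress' order and then inserts it without reusing the source's compressed blocks (both calls appear in the diff below). A minimal sketch; source_vf, target_vf and keys are illustrative placeholders, not names from the branch:

    # Ask for records in an order optimised for recompression, including
    # the delta closure, then recompress on insert rather than copying
    # existing groupcompress blocks verbatim.
    stream = source_vf.get_record_stream(keys, 'groupcompress', True)
    for _ in target_vf._insert_record_stream(stream, random_id=False,
                                             reuse_blocks=False):
        pass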


John A Meinel (jameinel) wrote:


Robert Collins wrote:
> Robert Collins has proposed merging lp:~lifeless/bzr/bug-402652 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
> [MERGE] Fix bug 402652 by recompressing all texts that are streamed -
> slightly slower at fetch, substantially faster and more compact at read.
>
>

 review: abstain

I personally feel that this is going overboard. If you are doing
something like branching all of Launchpad you will have:
60,000 revisions of perfectly packed data
and
1,000 revisions of poorly packed data (some of which is probably autopacked)

Telling the system that it must repack all 61,000 revisions on every
fetch is (IMO) significant overkill.

I think the current behaviour (unordered fetching) is a better tradeoff
than that, until we get to the point where we can do a partial packing
on the fly.
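
For context, the two behaviours being weighed are values of the ordering
argument to get_record_stream. Roughly, with source_vf standing in for
any versioned-files object:

    # 'unordered' lets the source return records in whatever order is
    # cheapest - often streaming existing compressed blocks as-is - so
    # fetch is fast but the target inherits the source's grouping.
    stream = source_vf.get_record_stream(keys, 'unordered', True)

    # 'groupcompress' orders records for recompression, so the target
    # can rebuild tight groups: a slower fetch, but faster later reads.
    stream = source_vf.get_record_stream(keys, 'groupcompress', True)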

By the way, is there an open bug for asking Launchpad to pack the
mirrored area? (You could force a pack of hosted bzr+ssh://
repositories, but that wouldn't do anything for the http side.)

John
=:->


review: Abstain
John A Meinel (jameinel) wrote:


Robert Collins wrote:
> Robert Collins has proposed merging lp:~lifeless/bzr/bug-402652 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
> [MERGE] Fix bug 402652 by recompressing all texts that are streamed -
> slightly slower at fetch, substantially faster and more compact at read.
>
>

A few actual review comments, while I'm looking at the code:

1) The test needs a self.addCleanup(target.unlock)
2) The change breaks some of the direct tests of insert_record_stream in
bzrlib/tests/test_groupcompress.py

3) I'm not sure what the 'if not inserted_keys' check is for (see the
sketch below). Maybe it is there to make those tests pass? inserted_keys
is only filled if random_id is set, to ensure that the ids sent are
genuinely not repeated. (I've run into some problems with the chk
fetching code in the past when I set random_id = True.)

If it is there because of those tests, I think that is incorrect. But
I'd like to understand what you are trying to do.

John
=:->
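
For reference, point 1 refers to the new test in the diff, which calls
target.lock_read() without a matching self.addCleanup(target.unlock).
The logic point 3 asks about looks roughly like this after the change
(simplified from bzrlib/groupcompress.py in the diff; the net effect
noted in the comments is a deduction, not something the branch states):

    inserted_keys = set()
    for record in stream:
        if random_id:
            # inserted_keys is only populated when random_id=True, to
            # catch callers that claim unique ids but repeat a key.
            if record.key in inserted_keys:
                trace.note('Insert claimed random_id=True,'
                           ' but then inserted %r two times', record.key)
                continue
            inserted_keys.add(record.key)
        # With random_id=False the set stays empty, so 'not inserted_keys'
        # is always true and this reduces to plain 'reuse_blocks'. With
        # random_id=True the key is added before this check runs, so
        # block reuse is disabled for every record.
        if not inserted_keys and reuse_blocks:
            # Try to copy an existing groupcompress block as-is.
            ...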

John A Meinel (jameinel) wrote:

I'm pretty sure this is superseded by the change to streaming fetch that I did. So I'm marking this as rejected. If I'm missing something, please re-open/resubmit.

review: Needs Resubmitting
Robert Collins (lifeless) wrote:

On Tue, 2009-10-06 at 18:01 +0000, John A Meinel wrote:
> Review: Resubmit
> I'm pretty sure this is superseded by the change to streaming fetch that I did. So I'm marking this as rejected. If I'm missing something, please re-open/resubmit.

It was the same thing, just an earlier version of it :P

So reject is fine.

-Rob

Preview Diff

=== modified file 'NEWS'
--- NEWS	2009-08-30 23:51:10 +0000
+++ NEWS	2009-09-01 07:35:08 +0000
@@ -69,6 +69,13 @@
   revisions that are in the fallback repository. (Regressed in 2.0rc1).
   (John Arbash Meinel, #419241)
 
+* Fetches from 2a to 2a are now again requested in 'groupcompress' order,
+  and all texts are recombined appropriately. This doesn't reuse existing
+  groups, which will be measurable in some specific circumstances - an
+  approximately 25% overhead. However, doing this ensures high performance
+  reads subsequent to the fetch operation, which is the most common
+  operation: write once read many. (Robert Collins, #402652)
+
 * Fix a segmentation fault when computing the ``merge_sort`` of a graph
   that has a ghost in the mainline ancestry.
   (John Arbash Meinel, #419241)
 
=== modified file 'bzrlib/groupcompress.py'
--- bzrlib/groupcompress.py	2009-08-30 21:34:42 +0000
+++ bzrlib/groupcompress.py	2009-09-01 07:35:08 +0000
@@ -1516,7 +1516,8 @@
         # test_insert_record_stream_existing_keys fail for groupcompress and
         # groupcompress-nograph, this needs to be revisited while addressing
         # 'bzr branch' performance issues.
-        for _ in self._insert_record_stream(stream, random_id=False):
+        for _ in self._insert_record_stream(stream, random_id=False,
+            reuse_blocks=False):
             pass
 
     def _insert_record_stream(self, stream, random_id=False, nostore_sha=None,
@@ -1580,7 +1581,7 @@
                                ' but then inserted %r two times', record.key)
                     continue
                 inserted_keys.add(record.key)
-            if reuse_blocks:
+            if not inserted_keys and reuse_blocks:
                 # If the reuse_blocks flag is set, check to see if we can just
                 # copy a groupcompress block as-is.
                 if record.storage_kind == 'groupcompress-block':
 
=== modified file 'bzrlib/repofmt/groupcompress_repo.py'
--- bzrlib/repofmt/groupcompress_repo.py	2009-08-24 19:34:13 +0000
+++ bzrlib/repofmt/groupcompress_repo.py	2009-09-01 07:35:08 +0000
@@ -932,7 +932,7 @@
         super(GroupCHKStreamSource, self).__init__(from_repository, to_format)
         self._revision_keys = None
         self._text_keys = None
-        # self._text_fetch_order = 'unordered'
+        self._text_fetch_order = 'groupcompress'
         self._chk_id_roots = None
         self._chk_p_id_roots = None
 
@@ -949,7 +949,7 @@
         p_id_roots_set = set()
         source_vf = self.from_repository.inventories
         stream = source_vf.get_record_stream(inventory_keys,
-                                             'unordered', True)
+                                             'groupcompress', True)
         for record in stream:
             if record.storage_kind == 'absent':
                 if allow_absent:
 
=== modified file 'bzrlib/tests/test_repository.py'
--- bzrlib/tests/test_repository.py	2009-08-17 23:15:55 +0000
+++ bzrlib/tests/test_repository.py	2009-09-01 07:35:08 +0000
@@ -683,6 +683,27 @@
 
 class Test2a(TestCaseWithTransport):
 
+    def test_fetch_combines_groups(self):
+        builder = self.make_branch_builder('source', format='2a')
+        builder.start_series()
+        builder.build_snapshot('1', None, [
+            ('add', ('', 'root-id', 'directory', '')),
+            ('add', ('file', 'file-id', 'file', 'content\n'))])
+        builder.build_snapshot('2', ['1'], [
+            ('modify', ('file-id', 'content-2\n'))])
+        builder.finish_series()
+        source = builder.get_branch()
+        target = self.make_repository('target', format='2a')
+        target.fetch(source.repository)
+        target.lock_read()
+        details = target.texts._index.get_build_details(
+            [('file-id', '1',), ('file-id', '2',)])
+        file_1_details = details[('file-id', '1')]
+        file_2_details = details[('file-id', '2')]
+        # The index, and what to read off disk, should be the same for both
+        # versions of the file.
+        self.assertEqual(file_1_details[0][:3], file_2_details[0][:3])
+
     def test_format_pack_compresses_True(self):
         repo = self.make_repository('repo', format='2a')
         self.assertTrue(repo._format.pack_compresses)