Merge lp:~spiv/bzr/smarter-index-search into lp:bzr
| Status: | Merged |
|---|---|
| Approved by: | Andrew Bennetts on 2010-04-08 |
| Approved revision: | 5094 |
| Merged at revision: | not available |
| Proposed branch: | lp:~spiv/bzr/smarter-index-search |
| Merge into: | lp:bzr |
| Diff against target: |
365 lines (+176/-29) 5 files modified
NEWS (+6/-0) bzrlib/index.py (+95/-7) bzrlib/repofmt/pack_repo.py (+14/-22) bzrlib/tests/per_pack_repository.py (+17/-0) bzrlib/tests/test_index.py (+44/-0) |
| To merge this branch: | bzr merge lp:~spiv/bzr/smarter-index-search |
| Related bugs: |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| Robert Collins (community) | 2010-03-26 | Approve on 2010-04-05 | |
| Martin Packman (community) | Approve on 2010-03-31 | ||
| John A Meinel | 2010-03-18 | Pending | |
|
Review via email:
|
|||
This proposal supersedes a proposal from 2010-03-12.
Description of the Change
Optimise index lookups in repositories with many pack files.
First, the headline: this greatly improves "bzr pull" of one new revision of grub from savannah by HTTP (as reported on the mailing list, which has further analysis):
bzr.dev: 2424kB (50.2kB/s r:2395kB w:30kB)
this patch: 1034kB (43.3kB/s r:1022kB w:12kB)
Given that the pack data transferred is 701266 bytes (which itself seems quite large for such a small change...), that brings the index-searching overhead from 2.42x to 0.45x of the bytes read. It also halves the wall-clock time :)
That repo has I think 14 packs, and bzr.dev tries 11 indices for each of rix, iix, etc before finding the data it needs for that fetch.
There are two parts to this change:
1) when a CombinedGraphIndex performs a lookup, it shuffles the index or indices that contained the records to the front of self._indices on the assumption that future lookups should try those first.
2) propagates that reordering to the other CombinedGraphIndex objects from the same pack collection. This is done by a) associating a name (the pack name) with the elements of CombinedGraphIndex, and b) linking the revisions/
I've been pretty conservative with API changes: the new behaviour is only activated by optional keyword arguments, so existing uses of CombinedGraphIndex should see no change of behaviour (including no improvement). This is to make it as easy as possible to backport this change to 2.1 and 2.0 if we choose to.
I think this change needs some tests before it's truly ready to merge, but it's getting to the end of my work week and I think this code is ready for feedback, so here it is!
| John A Meinel (jameinel) wrote : | # |
| John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/smarter-index-search into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
>
> Optimise index lookups in repositories with many pack files.
>
> First, the headline: this greatly improves "bzr pull" of one new revision of grub from savannah by HTTP (as reported on the mailing list, which has further analysis):
>
> bzr.dev: 2424kB (50.2kB/s r:2395kB w:30kB)
> this patch: 1034kB (43.3kB/s r:1022kB w:12kB)
>
> Given that the pack data transferred is 701266 bytes (which itself seems quite large for such a small change...), that brings the index-searching overhead from 2.42x to 0.45x of the bytes read. It also halves the wall-clock time :)
>
> That repo has I think 14 packs, and bzr.dev tries 11 indices for each of rix, iix, etc before finding the data it needs for that fetch.
>
> There are two parts to this change:
>
> 1) when a CombinedGraphIndex performs a lookup, it shuffles the index or indices that contained the records to the front of self._indices on the assumption that future lookups should try those first.
> 2) propagates that reordering to the other CombinedGraphIndex objects from the same pack collection. This is done by a) associating a name (the pack name) with the elements of CombinedGraphIndex, and b) linking the revisions/
>
> I've been pretty conservative with API changes: the new behaviour is only activated by optional keyword arguments, so existing uses of CombinedGraphIndex should see no change of behaviour (including no improvement). This is to make it as easy as possible to backport this change to 2.1 and 2.0 if we choose to.
>
> I think this change needs some tests before it's truly ready to merge, but it's getting to the end of my work week and I think this code is ready for feedback, so here it is!
>
I forgot to vote.
review: needs_fixing
Because it certainly should have some amount of testing. If only to
prevent future regressions (like we already have :). Though you don't
have to test it at the "get_stream()" level.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAku
0ykAoM7AAlpwfv/
=90G5
-----END PGP SIGNATURE-----
| Andrew Bennetts (spiv) wrote : | # |
John A Meinel wrote:
[...]
> Thanks for doing this. I did the work in Packer, but it is nice to have
> it here.
Ah! I thought we had something like this. I'll see what code and ideas
I can reuse.
> I would tend to set "auto_reorder" to always on. I don't see it ever
> making things worse, and there should only be a small overhead for the
> reordering. As such, I would also get rid of the constructor flag.
Hmm, ok. If I was just targetting bzr.dev I would have done this
already, but if you're confident it won't cause unexpected issues I'm
happy to make it unconditional (including in backports).
> Doing it by names is ok, I think Index already has _name for most
> indices. You could also order by "access_tuple()" which I believe is a
> public API and returns (transport, name) to the original .pack file. The
> *nice* thing about doing that is that you don't have to introduce any
> new apis here. But if it isn't as clean, then don't worry too much about it.
_name is really a path for _transport, and as such includes the suffix.
So I don't think it's as clean either code-wise or conceptually to rely
on that. And as you noticed later access_tuple is on AggregateIndex, so
it's not readily available either.
> If it isn't too much overhead, you might consider:
>
> keys_remove = keys.remove
> # Be careful, if keys starts empty, we would break, and get a name error
> index_hit_count = 0
> for index in self._indices:
> if not keys:
> break
> index_hit_count = 0
> for node in index.iter_
> keys_remove(
> yield node
> index_hit_count += 1
>
> if index_hit_count:
> hit_indices.
>
> hit_indices.
>
> self._move_
>
>
> The nice thing about it, is that it is adaptive. In that if you start a
> search, and hit it in index 1, then you keep looking there, but if the
> keys start showing up in index 2, then you'll switch. The only other bit
> of logic that I could think of is:
I don't quite follow you here. Wouldn't my logic switch to preferring
index 2 anyway?
It's not clear to me that ordering the hit_indices by hit count is
better.
> if index_hit_count == index.key_count():
> index_hit_count = 0 (or maybe 1)
>
> I realize to do this correctly, you'd need to track hit count between
> calls, which isn't realistic. However, that basic check will work for
> pack files that only have 1 or 2 keys that are found quickly. (And
> commit creates indexes with only 1 entry, so it is fairly common.) Just
> a thought, though.
This is an interesting idea. Although perhaps those look ups will
already fail very quickly because at that point that entire index has
already been read and cached? I don't think I'll do this in this patch
as I think it'd need more investigation and measurement to justify, but
it might make a good incremental improvement later.
> + # Tell all the CombinedGraphIndex objects about each other, so
> they can
> + # share hints about which pack names to search first.
> + all_combined = [agg_idx.
> + for combined_idx in all_combined:
> + ...
| Andrew Bennetts (spiv) wrote : | # |
Andrew Bennetts wrote:
> John A Meinel wrote:
[...]
> > The nice thing about it, is that it is adaptive. In that if you start a
> > search, and hit it in index 1, then you keep looking there, but if the
> > keys start showing up in index 2, then you'll switch. The only other bit
> > of logic that I could think of is:
>
> I don't quite follow you here. Wouldn't my logic switch to preferring
> index 2 anyway?
On reflection, I *think* what you mean is if a call to iter_entries (via
get_parent_map or otherwise) queries for [key_from_1, key_from_2,
key_from_2], my logic will keep the index order as [1, 2], but it might
be better to switch them to [2, 1]. (I initially thought that you were
talking about a search conducted over multiple calls to iter_entries,
not just one.)
That could be true, but I'm not sure it would really make a big
difference in practice. How about writing a patch for my patch and
doing some measurements? I'll happily review it :)
Or do you already have some analysis that would support this from when
you changed the Packer code?
-Andrew.
| John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> John A Meinel wrote:
> [...]
>> Thanks for doing this. I did the work in Packer, but it is nice to have
>> it here.
>
> Ah! I thought we had something like this. I'll see what code and ideas
> I can reuse.
>
>> I would tend to set "auto_reorder" to always on. I don't see it ever
>> making things worse, and there should only be a small overhead for the
>> reordering. As such, I would also get rid of the constructor flag.
>
> Hmm, ok. If I was just targetting bzr.dev I would have done this
> already, but if you're confident it won't cause unexpected issues I'm
> happy to make it unconditional (including in backports).
>
>> Doing it by names is ok, I think Index already has _name for most
>> indices. You could also order by "access_tuple()" which I believe is a
>> public API and returns (transport, name) to the original .pack file. The
>> *nice* thing about doing that is that you don't have to introduce any
>> new apis here. But if it isn't as clean, then don't worry too much about it.
>
> _name is really a path for _transport, and as such includes the suffix.
> So I don't think it's as clean either code-wise or conceptually to rely
> on that. And as you noticed later access_tuple is on AggregateIndex, so
> it's not readily available either.
>
>> If it isn't too much overhead, you might consider:
>>
>> keys_remove = keys.remove
>> # Be careful, if keys starts empty, we would break, and get a name error
>> index_hit_count = 0
>> for index in self._indices:
>> if not keys:
>> break
>> index_hit_count = 0
>> for node in index.iter_
>> keys_remove(
>> yield node
>> index_hit_count += 1
>>
>> if index_hit_count:
>> hit_indices.
>>
>> hit_indices.
>>
>> self._move_
>>
>>
>> The nice thing about it, is that it is adaptive. In that if you start a
>> search, and hit it in index 1, then you keep looking there, but if the
>> keys start showing up in index 2, then you'll switch. The only other bit
>> of logic that I could think of is:
>
> I don't quite follow you here. Wouldn't my logic switch to preferring
> index 2 anyway?
>
> It's not clear to me that ordering the hit_indices by hit count is
> better.
>
The point is that if the first index in the list gets 1 key hit, but the
second gets 10, and the third gets 5, then it will get sorted into
(index2, index3, index1) rather than (index1, index2, index3). At least,
what I saw with your code was that it is just 'preserve existing order,
but move the ones hit to the front'. Put another way, say the hits are:
5 index1
10 index2
0 index3
4 index5
0 index6
your version would sort to
index1
index2
index5
index3
index6
mine would sort to
index2
index1
index5
index3
index6
I think the latter is slightly preferred. I don't have any numbers to
back that up, just a feeling. (Most likely in this case index2 is a
'bigger' index, and thus has more to *be* hit.)
As for small indices getting rejected quickly, it is true they won't
encounter any I/O overhead, which may be sufficient. It is a function
call, but th...
| Andrew Bennetts (spiv) wrote : | # |
John A Meinel wrote:
[...]
> I think the latter is slightly preferred. I don't have any numbers to
> back that up, just a feeling. (Most likely in this case index2 is a
> 'bigger' index, and thus has more to *be* hit.)
I agree it probably helps, a little, but I'm not going to do it without
some measurements to confirm it does, because of the risk of
counter-intuitive results to untested optimisations. I know you've done
more exploring in this area so I trust your intuititions a bit more than
mine, but I'm still reluctant as the bang for buck seems low.
Also, even my simple logic will tend to keep the more hit indices closer
to the front than others after more and more queries, just not quite as
quickly as it might with hit counts. And this way we don't need to
worry about whether we should be keeping and aging hit counts between
queries, etc.
> As for small indices getting rejected quickly, it is true they won't
> encounter any I/O overhead, which may be sufficient. It is a function
> call, but that is probably tiny overhead, and they'll get sorted late
> quickly anyway.
Right. (FWIW, it's I/O overhead I'm most worried about.)
| Andrew Bennetts (spiv) wrote : | # |
I think all review comments have been addressed, but I'm sure John will tell me if I've missed something :)
| Martin Packman (gz) wrote : | # |
Don't know the code well enough to do a thorough review, but looks sane to me, tests pass, and it'll save me bandwidth.
| Robert Collins (lifeless) wrote : | # |
Andrew, when you and John are happy, I think you should just land this.
On the 'always reorder' side; if you do do that please check that the end of transaction stuff that pops off the in-progress indices won't be affected, back in the dim past it used [0] to 'know' that the first index was the one being written.
| Andrew Bennetts (spiv) wrote : | # |
Robert: Thanks for suggesting that. I've re-read a bunch of code and grepped, and as far as I can see, nothing is cheating by assuming [0] is the in-progress index that is being written. The test suite seems to pass too :)
So no changes needed there. I'm going to land this now before it goes stale.
- 5095. By Andrew Bennetts on 2010-04-08
-
Merge lp:bzr.
- 5096. By Andrew Bennetts on 2010-04-08
-
Add NEWS entry.
| John A Meinel (jameinel) wrote : | # |
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote:
> Andrew Bennetts has proposed merging lp:~spiv/bzr/smarter-index-search into lp:bzr.
>
> Requested reviews:
> John A Meinel (jameinel)
>
>
> Optimise index lookups in repositories with many pack files.
>
> First, the headline: this greatly improves "bzr pull" of one new revision of grub from savannah by HTTP (as reported on the mailing list, which has further analysis):
This doesn't seem to be as cheap as I expected it to be. I'm not 100%
sure who is at fault here, but I'm doing some profiling of loggerhead +
bzr-history-db, and I came across this:
25765 3.4509 1.8229 bzrlib.
+489715 0.8904 0.8904 +bzrlib.
+541075 0.6721 0.6721 +<method 'append' of 'list' objects>
+25765 0.0654 0.0654 +<zip>
489809 0.8906 0.8906 bzrlib.
590334 0.7374 0.7374 <method 'append' of 'list' objects>
153 0.2031 0.2031 <method 'write' of 'file' objects>
20612 3.0278 0.1986 bzrlib.
+20612 2.7483 1.4505 +bzrlib.
I'm guessing the issue is something walking over the ancestry (like bzr log)
5153 3.8017 0.0712 bzrlib.
Note that only 5k calls to _move_to_front gets increased to 25k calls to
_move_to_
rix/iix/
However, we must be missing something for move to be that expensive.
I'm guessing this is:
https:/
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://
iEYEARECAAYFAkv
k6gAniipAYSMpW9
=RE1+
-----END PGP SIGNATURE-----

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Andrew Bennetts wrote: inventories/ etc CombinedGraphIndex objects belonging to a single pack collection via setting a _sibling_indices attribute on them, c) using those links and names to apply the same reordering to those sibling indices.
> Andrew Bennetts has proposed merging lp:~spiv/bzr/smarter-index-search into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
>
> Optimise index lookups in repositories with many pack files.
>
> First, the headline: this greatly improves "bzr pull" of one new revision of grub from savannah by HTTP (as reported on the mailing list, which has further analysis):
>
> bzr.dev: 2424kB (50.2kB/s r:2395kB w:30kB)
> this patch: 1034kB (43.3kB/s r:1022kB w:12kB)
>
> Given that the pack data transferred is 701266 bytes (which itself seems quite large for such a small change...), that brings the index-searching overhead from 2.42x to 0.45x of the bytes read. It also halves the wall-clock time :)
>
> That repo has I think 14 packs, and bzr.dev tries 11 indices for each of rix, iix, etc before finding the data it needs for that fetch.
>
> There are two parts to this change:
>
> 1) when a CombinedGraphIndex performs a lookup, it shuffles the index or indices that contained the records to the front of self._indices on the assumption that future lookups should try those first.
> 2) propagates that reordering to the other CombinedGraphIndex objects from the same pack collection. This is done by a) associating a name (the pack name) with the elements of CombinedGraphIndex, and b) linking the revisions/
>
> I've been pretty conservative with API changes: the new behaviour is only activated by optional keyword arguments, so existing uses of CombinedGraphIndex should see no change of behaviour (including no improvement). This is to make it as easy as possible to backport this change to 2.1 and 2.0 if we choose to.
>
> I think this change needs some tests before it's truly ready to merge, but it's getting to the end of my work week and I think this code is ready for feedback, so here it is!
>
Thanks for doing this. I did the work in Packer, but it is nice to have
it here.
I would tend to set "auto_reorder" to always on. I don't see it ever
making things worse, and there should only be a small overhead for the
reordering. As such, I would also get rid of the constructor flag.
Doing it by names is ok, I think Index already has _name for most
indices. You could also order by "access_tuple()" which I believe is a
public API and returns (transport, name) to the original .pack file. The
*nice* thing about doing that is that you don't have to introduce any
new apis here. But if it isn't as clean, then don't worry too much about it.
It also avoids having to rebuild the mapping for every move call: _index_ names, self._indices) append( name)
+ indices_info = zip(self.
+ hit_indices_info = []
+ hit_names = []
+ unhit_indices_info = []
+ for name, idx in indices_info:
+ if idx in hit_indices:
+ info = hit_indices_info
+ hit_names.
+ else:
+ info = unhit_indices_info
+ ...