Code review comment for lp:~jameinel/bzr/jam-gdfo-heads

John A Meinel (jameinel) wrote:

So this is my follow-up to Vincent's new algorithm for KnownGraph.heads(). The only real changes are tweaks to the Pyrex code to use direct access.

There are some wins from changing to direct access to members, and the big win is changing the algorithm to avoid calling list.pop(). Instead we just track which item we are on and insert the next item there; if we find we need more space, we append. (Note there is a specific edge case when popping from short lists: CPython reallocs whenever the size shrinks to 1/2 the allocated size. If you only have 2 items, that is *every* pop, and then every append following it triggers another realloc.)
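To make the list handling concrete, here is a minimal sketch in plain Python (invented names, not the actual Pyrex; the real code also keeps 'seen' on the nodes themselves, per the next paragraph):

    def walk_with_pop(pending, get_parents, seen):
        # pop()-based walk: shrinking a short list can trigger a
        # realloc on every pop, and the append right after it then
        # reallocs again.
        while pending:
            node = pending.pop()
            for parent in get_parents(node):
                if parent not in seen:
                    seen.add(parent)
                    pending.append(parent)

    def walk_with_index(pending, get_parents, seen):
        # Track the index of the last live item instead: overwrite
        # slots in place and only append when out of room, so the
        # list never shrinks and never realloc-thrashes.
        last = len(pending) - 1
        while last >= 0:
            node = pending[last]
            last -= 1
            for parent in get_parents(node):
                if parent not in seen:
                    seen.add(parent)
                    last += 1
                    if last < len(pending):
                        pending[last] = parent   # reuse an existing slot
                    else:
                        pending.append(parent)   # grow only when needed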

Also, this switches from using a 'seen' dictionary to using a .seen attribute.

Here are my times:

                 annotate NEWS    time_graph (all 7k)
    bzr.dev          11.6s             71.4s
    vila pyx         11.5s              1.102s
    jam pyx           9.7s              0.362s

So while Vincent's update is godlike for whole-ancestry heads() calls, it doesn't show much of an effect on annotating NEWS. I'm pretty happy to find that with a bit of tweaking, the new algorithm still gives another 10% of total annotate performance (note that 'resolve right' is only 8.6s, so we are now only spending ~1s resolving something like 200k heads).

The main difference between my old heads() and Vincent's is that I was effectively doing a 'find_unique_ancestors' walk, while he realized that you never have to walk any ancestors below the candidate with the lowest gdfo.
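As a schematic of that cutoff (dict-based plain Python with invented names, using a plain 'seen' set where the real code uses the node .seen attribute; it assumes every parent appears in the maps):

    def heads(candidates, parent_map, gdfo):
        # An ancestor always has a strictly smaller gdfo than its
        # descendants, so once we reach the lowest candidate gdfo there
        # is no point walking further down: nothing below it can be a
        # candidate.
        heads = set(candidates)
        min_gdfo = min(gdfo[c] for c in candidates)
        pending = list(candidates)
        seen = set()
        while pending and len(heads) > 1:
            node = pending.pop()
            if node in seen:
                continue
            seen.add(node)
            if gdfo[node] <= min_gdfo:
                continue   # parents are all below min_gdfo; stop here
            for parent in parent_map[node]:
                heads.discard(parent)   # an ancestor can't be a head
                pending.append(parent)
        return heads

(The early exit once a single head remains is safe: the candidate with the maximal gdfo can never be an ancestor of another candidate, so it is never discarded.)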

One other bit of interest in the new code: we track how many nodes we have to walk in order to compute heads() (the counter used to be in 'cleanup', then in 'seen', and is now in 'cleanup' again).
The bzr heads() call that walks the most nodes is between:
    <email address hidden>
    <email address hidden>

(landing the groupcompress code into brisbane-core, IIRC)
We walk 22k revisions (out of 25.4k total...) because the plugin side has a gdfo of 182, while the bzr side has a gdfo of 7150. So one node is very close to null, and the other is about as far away as we can get in bzr. The time to compute heads() between the two... 5ms. Yes, that is 0.005s to walk 22k revisions, after computing gdfo and other such bits, and determine that neither revision is an ancestor of the other.

I did end up spending some time tweaking the _find_gdfo() code as well. Once you are down around 400ms for doing heads() on every merge revision, the 100ms or so spent in _find_gdfo suddenly starts to be a factor :).
I found that computing the tails 'for free' using a set was *not* free, and was, in fact, a lot slower than walking the dict after the fact. The issue is that parent_map iterates in effectively random order, so you end up pushing a lot of nodes into the set only to remove them again. In contrast, it takes only 40ms to walk the 200k nodes in the OOo dict and find the small handful of tails we actually care about.
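Schematically, the two approaches look something like this (plain Python, invented names; I'm assuming a tail is any node that never acquires parents of its own, including nodes only ever referenced as a parent):

    def tails_with_set(parent_map):
        # Maintained 'for free' during the build: nodes first seen as a
        # parent go into the set, and come back out if their own entry
        # later shows parents. parent_map's effectively random order
        # makes this add-then-remove churn constant.
        tails = set()
        nodes = {}
        for key, parent_keys in parent_map.items():
            nodes[key] = parent_keys
            if parent_keys:
                tails.discard(key)
            else:
                tails.add(key)
            for parent in parent_keys:
                if parent not in nodes:
                    nodes[parent] = None
                    tails.add(parent)
        return tails

    def tails_after_the_fact(nodes):
        # Walk the finished dict once instead: no churn, and per the
        # timings above this is only ~40ms for the 200k-node OOo graph.
        return set(key for key, parent_keys in nodes.items()
                   if not parent_keys)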

_find_gdfo is now down to around 9ms for bzr.dev. The bulk of the time in __init__ is now in _initialize_nodes, which is doing the mallocs for all the node objects. With _find_gdfo removed entirely, __init__ takes 35.5ms for bzr.dev and 407ms for OOo, versus 44.4ms and 563ms with it.
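For reference, the shape of the gdfo pass itself is roughly the following (a schematic sketch, not the tuned Pyrex; it assumes tails start at gdfo 1 and every parent appears as a key in parent_map):

    def find_gdfo(parent_map):
        # gdfo = generation distance from origin: tails get 1, everyone
        # else gets 1 + max(gdfo of its parents). Propagate up from the
        # tails, releasing a child only once all its parents are numbered.
        children = {key: [] for key in parent_map}
        remaining = {}
        pending = []
        for key, parent_keys in parent_map.items():
            remaining[key] = len(parent_keys)
            if not parent_keys:
                pending.append(key)
            for parent in parent_keys:
                children[parent].append(key)
        gdfo = dict.fromkeys(pending, 1)
        while pending:
            key = pending.pop()
            for child in children[key]:
                new_gdfo = gdfo[key] + 1
                if new_gdfo > gdfo.get(child, 0):
                    gdfo[child] = new_gdfo
                remaining[child] -= 1
                if remaining[child] == 0:
                    pending.append(child)
        return gdfo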
