Code review comment for lp:~spiv/bzr/faster-stacked-tree-build

Martin Pool (mbp) wrote:

+ # XXX: this is a gross hack.
+ mutter('HACK adding texts-for-rev:%s', tip)

[tweak] Can you add something more about what's specifically wrong with it, or what ought to be done instead (even just a pointer to bug 791704)? Otherwise, for someone who comes to look at it later, it's just kind of rubbing it in.

I presume what you're saying is gross is the special case for when this fires? Or is it that it's done as a special search at all? It does look like it will cause a performance cliff similar to the one we currently have when you branch into a new empty repository, as opposed to creating a new repository.

[optional] Thinking this through makes me wonder if we should have a debug option to force this on or off, so that we can e.g. use it as a workaround and can more easily compare the performance.
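For reference, a minimal sketch of what such a gate could look like. The flag names `texts_for_rev` / `no_texts_for_rev` are made up here; in bzrlib the `-Dxxx` command-line options accumulate in the `bzrlib.debug.debug_flags` set, for which a stand-in is used below so the sketch is self-contained:

```python
# Stand-in for bzrlib.debug.debug_flags, the set that -Dxxx options
# populate at startup.
debug_flags = set()

def should_use_texts_for_rev(default):
    """Decide whether to take the texts-for-rev fast path.

    Hypothetical -Dno_texts_for_rev / -Dtexts_for_rev overrides would
    let us force the behaviour off (as a workaround) or on (to compare
    performance), falling back to the normal heuristic otherwise.
    """
    if 'no_texts_for_rev' in debug_flags:
        return False
    if 'texts_for_rev' in debug_flags:
        return True
    return default
```

The same pattern is already used for other experimental paths that are gated on debug flags, so it should be cheap to add.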

+#     def get_full_keys(self):
+#         return set(sr.get_full_keys() for sr in self.search_results)

[query] If this is really dead can you just remove it? Likewise below.
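Incidentally, if this does come back to life, note that as written it would build a set *of* sets, which raises TypeError (sets aren't hashable), so presumably a union was intended. A quick illustration with made-up key data:

```python
# Two per-result key sets, as sr.get_full_keys() might return them.
search_results = [
    {('rev-1',), ('rev-2',)},
    {('rev-2',), ('rev-3',)},
]

# The commented-out form tries to put set objects inside a set,
# which fails because sets are unhashable.
try:
    set(sr for sr in search_results)
    raised = False
except TypeError:
    raised = True

# The union of the per-result sets is presumably what was meant.
full_keys = set().union(*search_results)
```

So either remove it, or fix it to the union form when it is revived.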

+    def __init__(self, revision_id, repo=None):
+        """Constructor.
+
+        :param revision_id: A revision ID. All records needed to create a
+            checkout of this revision are referenced by this result (i.e. the
+            revision record and corresponding inventory record, and all
+            referenced texts and chk_bytes).
+        :param repo: An optional repository. If not given, then
+            ``get_full_keys`` cannot be used.
+        """
+        self.revision_id = revision_id
+#        self.repo = repo

[query] Is the repo really unused? Maybe the function prototype or docstring needs to be changed?

+        if self.texts_for_inv is not None:
+            # Emit all CHK records for this inventory, and add all texts
+            # those nodes reference.
+            keys = set()
+            for node in self.texts_for_inv.id_to_entry.iter_nodes():
+                keys.add(node.key())
+                if type(node) is chk_map.LeafNode:
+                    self._text_keys.update(
+                        [chk_map._bytes_to_text_key(b) for b in
+                         node._items.values()])
+            stream = self.from_repository.chk_bytes.get_record_stream(
+                keys, 'unordered', True)
+            for record in stream:
+                yield record
+            uninteresting_root_keys.add(
+                self.texts_for_inv.id_to_entry.key())

[comment] This seems to be getting kind of large for a nested function but perhaps refactoring it is out of scope.
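If you do want to pull it out, the traversal is fairly self-contained. A shape-only sketch (all names below are illustrative stand-ins, not the real bzrlib classes; the real code also maps leaf values through `chk_map._bytes_to_text_key`):

```python
class InternalNode(object):
    """Stand-in for an internal CHK node (references no texts)."""
    def __init__(self, key):
        self._key = key
    def key(self):
        return self._key

class LeafNode(object):
    """Stand-in for chk_map.LeafNode (its items reference text keys)."""
    def __init__(self, key, items):
        self._key = key
        self._items = items
    def key(self):
        return self._key

def iter_inventory_chk_records(id_to_entry, chk_bytes, text_keys):
    """Yield all CHK records for one inventory's id_to_entry map,
    accumulating the text keys its leaf nodes reference into text_keys."""
    keys = set()
    for node in id_to_entry.iter_nodes():
        keys.add(node.key())
        if type(node) is LeafNode:
            text_keys.update(node._items.values())
    for record in chk_bytes.get_record_stream(keys, 'unordered', True):
        yield record

# Minimal fakes so the helper can be exercised standalone.
class _FakeMap(object):
    def __init__(self, nodes):
        self._nodes = nodes
    def iter_nodes(self):
        return iter(self._nodes)

class _FakeStore(object):
    def get_record_stream(self, keys, ordering, include_delta_closure):
        return iter(sorted(keys))

text_keys = set()
records = list(iter_inventory_chk_records(
    _FakeMap([InternalNode(('sha1:a',)),
              LeafNode(('sha1:b',), {'file-id': ('text-key',)})]),
    _FakeStore(), text_keys))
```

That would also make the "is this reached?" question below directly testable, since the helper can be unit-tested in isolation.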

[query] It's not obvious to me that there are tests for this code path actually being reached, but I might just be missing them.

I'd like to get a second review, probably from John, and someone should also build this and play with it to check that the behaviour is reasonable. This needs to be considered a risk factor for the landing on lp (maybe we should have somewhere to accumulate a list of them).
