Code review comment for lp:~abentley/bzr/fix_get_mtime

Revision history for this message
Aaron Bentley (abentley) wrote :

> Creating a path<->id map is expensive on large trees;

Generally, this price will have already been paid.

trans_id_tree_file_id and trans_id_file_id will hit id2path, so a lot of the client code that constructs a TreeTransform will hit id2path.

Even if id2path isn't used in the construction of the TreeTransform, trans_id_file_id is heavily used in the implementation of PreviewTree, so using the tree will generally hit id2path.

I don't understand why there needs to be such a map. Can we not retrieve only the data we care about? That would scale with the amount of data we're actually showing, which I think is an acceptable scaling factor.

> I've found significant
> benefits in commit etc when enough layers are pushed through to avoid this;
> not passing in the path will cause that map to be created. My understanding
> from Martin is that we want to be heading across the board to proving path
> data as much as possible - to avoid unnecessary work.

I wouldn't dismiss these benefits, but I don't know that they apply here.

This is a codepath that you will hit only when querying the mtime of an unmodified file. Our diff code doesn't do that; only qbzr's does.

> I'm not sure why its complete for TreeTransform to calculate the path, surely
> its just a walk of transaction parents ?

If the transaction does not create or rename the parents, there's no guarantee we have their names, much less their paths.

Aaron

« Back to merge proposal