John A Meinel wrote:
> Review: Needs Fixing
> Overall, this is looking pretty good. A few small tweaks here and there, and
> possible concerns. But only one major concern.

That's great news. Thanks very much for the thorough review.

> You changed the default return value of "iter_inventories()" to return
> 'unordered' results, which means that "revision_trees()" also now returns
> 'unordered' results. Which is fairly serious, and I was able to find more
> than one code path that was relying on the ordering of revision_trees(). So
> I think the idea is sound, but it needs to be exposed in a backwards
> compatible manner by making it a flag that defaults to "as-requested"
> ordering, and then we fix up code paths as we can.
>
> I don't specifically need to review that change, though. And I don't really
> want to review this patch yet again :). So I'm voting tweak, and you can
> submit if you agree with my findings.

Ouch, yes, that does sound serious. I'll fix that.

>    119 +         from bzrlib.graph import FrozenHeadsCache
>    120 +         graph = FrozenHeadsCache(graph)
>    121 +         new_roots_stream = _new_root_data_stream(
>    122 +             root_id_order, rev_id_to_root_id, parent_map, self.source, graph)
>    123 +         return [('texts', new_roots_stream)]
>
> ^- I'm pretty sure if we are using FrozenHeadsCache that we really want to
> use KnownGraph instead. We have a dict 'parent_map' which means we can do
> much more efficient heads() checks since the whole graph is already loaded.
> This is a minor thing we can do later, but it would probably be good not to
> forget.

That's a good idea, I'll try that. Perhaps we should just grep the entire code
base for FrozenHeadsCache and replace all uses with KnownGraph...

> ...
>
>    392 -        elif last_modified[-1] == ':':
>    393 -            raise errors.BzrError('special revisionid found: %r' % line)
>    394 -        if not delta_tree_references and content.startswith('tree\x00'):
>    395 +        elif newpath_utf8 != 'None' and last_modified[-1] == ':':
>    396 +            # Deletes have a last_modified of null:, but otherwise special
>    397 +            # revision ids should not occur.
>    398 +            raise errors.BzrError('special revisionid found: %r' % line)
>    399 +        if delta_tree_references is False and content.startswith('tree\x00'):
>
> ^- What does "newpath_utf8 != 'None'" mean if someone does:
>
>   touch None
>   bzr add None
>   bzr commit -m "adding None"
>
> Is this a serialization bug waiting to be exposed?
>
> I guess not as it seems the paths are always prefixed with "/", right?
>
> (So a path of None would actually be "newpath_utf8 == '/None'")

That's right. No bug here.

> However, further down (sorry about the bad indenting):
>
>    413             if newpath_utf8 == 'None':
>    414                 newpath = None
>
> ^- here you set "newpath=None" but you *don't* set "newpath_utf8" to None.
[...]
> And then here "newpath_utf8" is passed to _parse_entry.
>
> Now I realize this is probably caught by "content_tuple[0] == 'deleted'".
> Though it feels a bit icky to rely on "newpath_utf8" in one portion and
> "content_tuple[0]" in another (since they can potentially be out of sync.)

Yes. There's a bit of ickiness here too in that _parse_entry redoes the utf8
decoding. I'll clean this up.

> I think if we just force an early error with:
>
>   if newpath_utf8 == 'None':
>       newpath = None
>       newpath_utf8 = None
>
> Then if there is an issue at least it fails rather than succeeding to create
> a new file with the name "None".

The idea, I think, is that after this section newpath_utf8 and oldpath_utf8
aren't needed anymore, because we've either set newpath and oldpath or we've
raised an error. _parse_entry ought to just receive newpath, rather than
re-decoding newpath_utf8, and I've made that change. I can insert a "del
newpath_utf8, oldpath_utf8" if you think that would make that clearer?
Assigning None to them is similar but less explicit, and either way it should
be done unconditionally at that point in the loop, not just in the 'None'
case.
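To make that concrete, here's a rough sketch of the shape I mean (a
hypothetical helper; the real loop handles more fields than this):

    def _parse_paths(oldpath_utf8, newpath_utf8):
        # 'None' is unambiguous because real paths are always serialized
        # with a leading '/' (so a file actually named None arrives as
        # '/None').
        if oldpath_utf8 == 'None':
            oldpath = None
        else:
            oldpath = oldpath_utf8[1:].decode('utf8')
        if newpath_utf8 == 'None':
            newpath = None
        else:
            newpath = newpath_utf8[1:].decode('utf8')
        # The raw fields aren't needed past this point; deleting them
        # unconditionally means any later reference fails loudly instead
        # of silently creating a file called "None".
        del oldpath_utf8, newpath_utf8
        return oldpath, newpath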
> if utf8_path.startswith('/'):
>
> ^- If this is a core routine (something called for every path) then:
>
>   if utf8_path[:1] == '/':
>
> *is* faster than .startswith() because you
> 1) Don't have a function call
> 2) Don't have an attribute lookup
>
> I'm assuming this is a function that gets called a lot. If not, don't worry
> about it.

It is called for every path, but it's barely even registering in lsprof: just
0.40% of _extract_and_insert_inventories is spent in parse_text_bytes, and
startswith accounts for 3.5% of that 0.40%, which explains why it was so hard
to find in kcachegrind :)

To be fair, this was a local push, so _stream_invs_as_deltas from the source
was being largely counted under _extract_and_insert_inventories, but it's
still a pretty minuscule amount of time. I don't think micro-optimisations are
going to make a great deal of difference here, at least not until we remove
some other bottlenecks. (Incidentally, fixing the duplicated utf8 decode of
newpath_utf8 will probably cut ~8% of the time of parse_text_bytes according
to kcachegrind, which is already much larger than the potential benefit of
replacing startswith.)

> ...
>
>    566 +        if required_version < (1, 18):
>    567 +            # Remote side doesn't support inventory deltas. Wrap the stream to
>    568 +            # make sure we don't send any. If the stream contains inventory
>    569 +            # deltas we'll interrupt the smart insert_stream request and
>    570 +            # fallback to VFS.
>    571 +            stream = self._stop_stream_if_inventory_delta(stream)
>
> ^- it seems a bit of a shame that if we don't support deltas we fall back to
> VFS completely, rather than trying something intermediate (like falling back
> to the original code path of sending full inventory texts, or IDS, or...)
>
> I think we are probably okay, but this code at least raises a flag. I expect
> a bug report along the lines of "fetching between 1.18 and older server is
> very slow". I haven't looked at all the code paths to determine if 1.18 will
> have regressed against a 1.17 server. Especially when *not* converting
> formats. Have you at least manually tested this?

Fetching *cross-format* over the network is already extremely slow, due to
IDS. This isn't optimal, but it should be no worse (and possibly better) than
IDS. If it isn't a cross-format fetch, then there will be no inventory-deltas
stream. So I'm pretty sure there's no regression here.

Also, as Robert points out, the alternatives are much more problematic: the
sink doesn't have any way to rewind or restart the stream it's given. We can't
even up-cast inventory-deltas to inventory fulltexts, because some formats
(e.g. 2a in 1.17) don't have a usable inventory fulltext format (they try to
use xml5 to deserialise them).
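For what it's worth, the wrapper is essentially a filter over the substreams,
roughly this shape (a simplified sketch, not the exact method):

    def _stop_stream_if_inventory_delta(self, stream):
        """Pass substreams through until an inventory-deltas substream appears.

        Sketch only: the real method also records that it stopped, so the
        caller knows to abandon the smart request and retry via VFS.
        """
        for substream_kind, substream in stream:
            if substream_kind == 'inventory-deltas':
                # Don't send this substream (or anything after it); stopping
                # here interrupts the insert_stream request.
                return
            yield substream_kind, substream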
> ...
>
>    703 +        self.new_pack.finish_content()
>    704 +        if len(self.packs) == 1:
>    705 +            old_pack = self.packs[0]
>    706 +            if old_pack.name == self.new_pack._hash.hexdigest():
>    707 +                # The single old pack was already optimally packed.
>    708 +                self.new_pack.abort()
>    709 +                return None
>
> ^- I think this is a good fix, but
> 1) It really should have a NEWS entry and a test case [it might have a test],
> as I'm sure there are people following the bug.

I'll add a NEWS entry. I don't think this fixes the bug report I know of,
though (perhaps one case of it?). To be honest, I'm not sure what the right
test to write here is. I agree it would be good to have an explicit test for
this, but I'm not sure where to add it, or how to provoke this condition in a
succinct way (and without grossly violating the layering). Tests were failing
without this fix, though :)

> 2) It probably should say something about the repository already being
> optimally packed. At a minimum a mutter() but I'm thinking it is a
> reasonable 'note()'.

I'm not confident that it's a good note(), so I'll err on the side of a
mutter(). I'm pretty sure I saw this come up in tests during autopacks after a
fetch, when the user didn't explicitly ask for a pack... ah, here's a relevant
part of the conversation I had with Robert about it:

  I suspect the following conditions are occurring:
   - you upload a single pack
   - the pack is sufficiently simple that the sort order for its contents is
     the same as the upload order happened to generate
   - the group splitting heuristics happen to line up at the same boundaries
  -> collision, with the same content. so this is natural.

So I don't think we should emit a note() in that circumstance.

>    737     def _get_source(self, to_format):
[...]
>    751 +        # Actually, this test is just slightly looser than exact so that
>    752 +        # CHK2 <-> 2a transfers will work.
>
> ^- I think this should probably be ".network_name() == other.network_name()"
> and we just customize the names to be the same. Is that possible to do?

Robert already answered this.

>    854 +    def iter_inventories(self, revision_ids, ordering='unordered'):
>
> ^- the "iter_inventories" api was written such that it always returned the
> inventories in the *same* order as the supplied revisions, and we've had
> code that did:
>
>   for rev, inv in izip(revs, iter_inventories(revs)):
>     ...
>
> I've tried to get away from it, because of inv.revision_id, but I think this
> is a significant API break that should be documented.
>
> That is the original reason *why* _iter_inventory_xmls() always buffered the
> texts before yielding.

...and it still does, although now it buffers only as much as is necessary to
give results in order, rather than buffering the full result. See the "while
next_key in text_chunks" loop: next_key comes from iter(keys), i.e. that loop
yields the results in the same order as they were passed in. I've added a
comment to clarify.

So, actually, this change isn't having the effect I intended; thanks for
highlighting that! Hmm. The idea was that _stream_invs_as_deltas wants to
generate the deltas in topological order so that the sink can always insert
them without needing to buffer them. Given that it was always returning them
in as-requested order, I wonder why that never caused any trouble? The
revisions list passed in there is ordered in python dict.keys() order, I
think! I've fixed it anyway.

[...]

> However, I really think for *your* patch, we shouldn't change the return
> ordering of "iter_inventories". We could change it as:
>
>   def iter_inventories(self, revisions, ordering=None):
>       if ordering is None or ordering == 'as-requested':
>           # return inventories in the exact order as requested
>           ...

Right. I've made a change along these lines, although with just ordering=None
(having 'as-requested' as a second spelling for None/omitting the arg doesn't
seem much clearer to me).
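So the signature now looks roughly like this (paraphrasing my change; the
docstring wording here is mine, not a quote):

    def iter_inventories(self, revision_ids, ordering=None):
        """Get many inventories by revision_ids.

        :param ordering: if None (the default), results are returned in the
            same order as revision_ids, buffering as needed to do so;
            callers that can cope may explicitly pass 'unordered' or
            'topological'.
        """
        # (body unchanged apart from respecting the ordering flag)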
>    1185 +        elif from_format.network_name() == self.to_format.network_name():
>    1186 +            # Same format.
>    1187 +            return self._get_simple_inventory_stream(revision_ids,
>    1188 +                missing=missing)
>    1189 +        elif (not from_format.supports_chks and not self.to_format.supports_chks
>    1190 +              and from_format._serializer == self.to_format._serializer):
>    1191 +            # Essentially the same format.
>    1192 +            return self._get_simple_inventory_stream(revision_ids,
>    1193 +                missing=missing)
>
> ^- would the second "elif" be better as just an "or ..." on the first elif?

Personally I find multi-line if conditions pretty hard to untangle when
reading, so I picked this as the lesser evil in this instance. I'm sure
individual tastes vary, though.

>    1434 +    def _should_fake_unknown(self):
>    1435 +        # This is a workaround for bugs in pre-1.18 clients that claim to
>
> ^- This seems interesting, is it well tested that this is triggered? (And
> did you manually use an older bzr and test it with a newer server?)

I'm pretty sure I tested this manually. I'll double-check. I'll also add
something to test_smart.

>    1660 -        result.append((InterRepository,
>    1661 -                       weaverepo.RepositoryFormat5(),
>    1662 -                       knitrepo.RepositoryFormatKnit3()))
[...]
>    1691 +    add_combo(knitrepo.RepositoryFormatKnit1(),
>    1692 +              pack_repo.RepositoryFormatKnitPack1())
>
> ^- so aside from some obvious duplication that needed to be removed, this
> looks like it removes some test permutations.
>
> Namely, testing that both "InterRepository" *and* "InterKnitRepository" work
> between certain repository formats.
>
> I'm a little concerned that we are opening potential holes here. What do you
> think?

Hah. It *looks* like that, but actually nothing was using the
interrepository_class at all (except a test for the test parameterisation
helpers!). So not only have I removed duplication, I've removed the impression
that there's more coverage here than there actually is.

> I can't say that I did a fully thorough review of all the tests. My primary
> concern is the change to "iter_inventories" which breaks a bunch of
> assumptions from higher up code.

Thanks very much for the review.

-Andrew.