-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Andrew Bennetts wrote: > Andrew Bennetts has proposed merging lp:~spiv/bzr/inventory-delta into lp:bzr. > > Requested reviews: > bzr-core (bzr-core) > > This is a pretty big patch. It does lots of things: > > * adds new insert_stream and get_stream verbs > * adds de/serialization of inventory-delta records on the network > * fixes rich-root generation in StreamSource > * adds a bunch of new scenarios to per_interrepository tests > * fixes some 'pack already exist' bugs for packing a single GC pack (i.e. when > the new pack is already optimal). > * improves the inventory_delta module a little > * various miscellaneous fixes and new tests that are hopefully self-evident > * and, most controversially, removes InterDifferingSerializer. > >>From John's mail a while back there were a bunch of issues with removing IDS. I > think the outstanding ones are: So for starters, let me mention what I found wrt performance: time bzr.dev branch mysql-1k myqsl-2a/1k real 3m18.490s time bzr.dev+xml8 branch mysql-1k myqsl-2a/1k real 2m29.953s +xml8 is just this patch: === modified file 'bzrlib/xml8.py' - --- bzrlib/xml8.py 2009-07-07 04:32:13 +0000 +++ bzrlib/xml8.py 2009-07-16 16:14:38 +0000 @@ -433,9 +433,9 @@ pass else: # Only copying directory entries drops us 2.85s => 2.35s - - # if cached_ie.kind == 'directory': - - # return cached_ie.copy() - - # return cached_ie + if cached_ie.kind == 'directory': + return cached_ie.copy() + return cached_ie return cached_ie.copy() kind = elt.tag It has 2 basic effects: 1) Avoid copying all inventory entries all the time (so reduce the time spent in InventoryEntry.copy()) 2) By re-using exact objects "_make_delta" can do "x is y" comparisons, rather than having to do: x.attribute1 == y.attribute1 and x.attribute2 == y.attribute2 etc. As you can see it is a big win for this test case (about 4:3 or 33% faster) So what about Andrew's work: time bzr.inv.delta branch mysql-1k myqsl-2a/1k real 10m14.267s time bzr.inv.delta+xml8 branch mysql-1k myqsl-2a/1k real 9m49.372s It also was stuck at: [##################- ] Fetching revisions:Inserting stream:Walking content 912/1043 For most of that time, making it really look like it was stalled. Anyway, this isn't something where it is, say, 10% slower which is acceptable because we get rid of some extra code paths. This ends up being 3-4x slower and no longer giving any progress information. If that scales to launchpad sized projects, you are talking 4-days becoming 16-days (aka > 2 weeks). So honestly, I don't think we can land this as is. I won't stick on the performance side if people feel it is acceptable. But I did spend a lot of time optimizing IDS that clearly hasn't been done with StreamSource. John =:-> -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (Cygwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkpfWm0ACgkQJdeBCYSNAAM8mgCgru3K3SpP8BcMZdLJLHkqSSTQ TlAAoLVd7ndCbPeHNl3Kxbu0clMKLnR7 =BB8Q -----END PGP SIGNATURE-----