> Thanks for doing this Jeroen, it looks pretty good so far. I've just got a > few nits inline: Ah, one of those reviews that actually make my code better. Good! > > === modified file 'lib/lp/archivepublisher/domination.py' > > --- lib/lp/archivepublisher/domination.py 2011-11-02 12:58:31 +0000 > > +++ lib/lp/archivepublisher/domination.py 2011-11-03 10:12:17 +0000 > > @@ -192,6 +197,104 @@ > > else: > > return version_comparison > > > > + def sortPublications(self, publications): > > + """Sort publications from most to least current versions.""" > > + # Listify; we want to iterate this twice, which won't do for a > > + # non-persistent sequence. > > + sorted_publications = list(publications) > > This variable is prematurely named, I think I'd just write: > publications = list(publications) > and return publications.sort(...) later. Yup, you're right. Done. (Apart from one nasty little subtlety: surprisingly, list.sort does not return the sorted list. I suppose that's to avoid obscuring the fact that it modifies the list in-place.) > > +def get_binary_versions(binary_publications): > > + """List versions for sequence of `BinaryPackagePublishingHistory`.""" > > + return [pub.binarypackagerelease.version for pub in > binary_publications] > > If you're going to do :param: and :return: for the other functions, you > should do it here too! Hoist by my own pedantard. Done. > > +def find_live_binary_versions_pass_1(publications): > > + """Find versions out of Published `publications` that should stay live. > > + > > + This particular notion of liveness applies to first-pass binary > > + domination: the latest version stays live, and so do publications of > > + binary packages for the "all" architecture. > > + > > + :param publications: An iterable of `BinaryPackagePublishingHistory`, > > + sorted by descending package version. > > + :return: A list of live versions. > > + """ > > + publications = list(publications) > > + latest = publications.pop(0) > > + return get_binary_versions( > > + [latest] + [ > > + pub for pub in publications if not pub.architecture_specific]) > > pub.architecture_specific masks a query, in case you didn't know. I > absolutely detest properties that do that, it encourages SQL death-by-a- > thousand-cuts. I spotted it through sheer luck. It's just dereferencing a Storm reference though, and I took care to bulk-load the referenced BinaryPackageReleases. > > +def find_live_binary_versions_pass_2(publications): > > + """Find versions out of Published `publications` that should stay live. > > + > > + This particular notion of liveness applies to second-pass binary > > + domination: the latest version stays live, and architecture-specific > > + publications stay live (i.e, ones that are not for the "all" > > + architecture). > > + > > + More importantly, any publication for binary packages of the "all" > > + architecture stay live if any of the non-"all" binary packages from > > + the same source package release are still active -- even if they are > > + for other architectures. > > + > > + This is the raison d'etre for the two-pass binary domination algorithm: > > + to let us see which architecture-independent binary publications can be > > + superseded without rendering any architecture-specific binaries from > the > > + same source package release uninstallable. > > + > > + (Note that here, "active" includes Published publications but also > > + Pending ones. This is standard nomenclature in Soyuz. Some of the > > + domination code confuses matters by using the term "active" to mean > only > > + Published publications). > > + > > + :param publications: An iterable of `BinaryPackagePublishingHistory`, > > + sorted by descending package version. > > + :return: A list of live versions. > > Thanks for the detailed docstring. I'd only add that "all" means not > architecture_specific otherwise non-soyuz people reading this will get > confused. This is something I feared: it is mentioned actually, right in the second paragraph, but there's probably just too much in here to make it come out clearly. Since I'll be doing more work on this code anyway, there is still time to suggest better wordings. > > + publications = list(publications) > > + latest = publications.pop(0) > > + is_arch_specific = attrgetter('architecture_specific') > > + arch_specific_pubs = filter(is_arch_specific, publications) > > + arch_indep_pubs = filter( > > + lambda pub: not is_arch_specific(pub), > > + publications) > > Out of interest, why didn't you use a list comprehension? I thought map and > filter were deprecated these days. That's what I keep thinking too. But the python documentation says otherwise. Perhaps it was some other massively useful function that got deprecated, or perhaps people changed their minds. Anyway, looking this up also made me find ifilter and ifilterfalse (in itertools). With those, I can finally say what I need here tersely and uniformly: arch_specific_pubs = list(ifilter(is_arch_specific, publications)) arch_indep_pubs = list(ifilterfalse(is_arch_specific, publications)) > > + # XXX JeroenVermeulen 2011-11-01 bug=884649: This is likely to be > > + # costly, and the result could be reused for all builds of the same > > + # source package release to all architectures. You didn't comment but I saw it in your eyes: yes, this is what bug-884649-branch-4 will address. > > + # Verify that the publications are really sorted properly. > > + check_order = OrderingCheck(cmp=generalization.compare, > reverse=True) > > Do you really need this? Isn't it just overhead? > > (I realise it's belt & braces to prevent coding snafus, but this is a > performance-critical area) It's not very costly compared to inevitable work we're already doing: for a list of n publications, this does n-1 version comparisons. That's guaranteed to be no more than the sort operation that produced the publications list in the first place. The loop depends very heavily on the sort order that this verifies, and there are several call sites that do the sorting. It'd be nice if we could just re-sort to make sure of the right order, but besides being more expensive that also wouldn't help the liveness functions which also rely on sort order. Given how important it is to get this right and how easy it is to get it wrong, I think verifying the order is worth the small added cost. > > @@ -550,7 +616,12 @@ > > ) > > > > def findSourcesForDomination(self, distroseries, pocket): > > - """Find binary publications that need dominating.""" > > + """Find binary publications that need dominating. > > + > > + This is only for traditional domination, where the latest published > > + publication is always kept published. It will ignore publications > > + that have no other publications competing for the same binary > package. > > + """ > > This docstring doesn't make sense when read in isolation. You might want to > mention the arch-all thing or just reference the explanation in the docstring > for find_live_binary_versions_pass_2(). Done. > > @@ -589,11 +660,18 @@ > > distroseries.name, pocket.title) > > > > generalization = GeneralizedPublication(is_source=True) > > + > > + self.logger.debug("Finding sources...") > > sources = self.findSourcesForDomination(distroseries, pocket) > > + sorted_packages = self._sortPackages(sources, generalization) > > > > self.logger.debug("Dominating sources...") > > - self._dominatePublications( > > - self._sortPackages(sources, generalization), generalization) > > + for name, pubs in sorted_packages.iteritems(): > > + self.logger.debug("Dominating %s" % name) > > + assert len(pubs) > 0, "Dominating zero binaries!" > > This is for sources! Whoopsie. Fixed. > > === modified file 'lib/lp/archivepublisher/tests/test_dominator.py' > > --- lib/lp/archivepublisher/tests/test_dominator.py 2011-11-02 10:28:31 > +0000 > > +++ lib/lp/archivepublisher/tests/test_dominator.py 2011-11-03 10:12:17 > +0000 > > @@ -358,6 +372,16 @@ > > SeriesStatus.OBSOLETE) > > > > > > +def remove_security_proxies(proxied_objects): > > + """Return list of `proxied_objects`, without their proxies. > > + > > + The dominator runs only in scripts, where security proxies don't get > > + in the way. To test realistically for this environment, strip the > > + proxies wherever necessary and do as you will. > > + """ > > + return [removeSecurityProxy(obj) for obj in proxied_objects] > > > This tells me that the script is running in the wrong layer perhaps? > I've never had to do this in a Zopeless test - there is no proxy to remove. > > > +def make_publications_arch_specific(pubs, arch_specific=True): > > + """Set the `architecturespecific` attribute for given SPPHs. > > + > > + :param pubs: An iterable of `BinaryPackagePublishingHistory`. > > + :param arch_specific: Whether the binary package releases published > > + by `pubs` are to be architecture-specific. If not, they will be > > + treated as being for the "all" architecture. > > + """ > > + for pub in pubs: > > + bpr = removeSecurityProxy(pub).binarypackagerelease > > + bpr.architecturespecific = arch_specific > > + > > + > > +class TestLivenessFunctions(TestCaseWithFactory): > > + """Tests for the functions that say which versions are live.""" > > + > > + layer = ZopelessDatabaseLayer > > Hmmm. Something is weird then. Ah, it's the set_attribute in the zcml > for BPR of course.... I was also not allowed to call getOtherPublicationsForSameSource. > > + def test_find_live_binary_versions_pass_1_blesses_arch_all(self): > > + versions = list(reversed(['1.%d' % version for version in > range(3)])) > > + bpphs = make_bpphs_for_versions(self.factory, versions) > > + > > + # All of these publications are architecture-specific, except > > + # the last one. This would happen if the binary package had > > + # just changed from being architecture-specific to being > > + # architecture-independent. > > Did you consider the case where arch-indep turns into arch-specific? Yes. It's the same story though, so not really worth testing separately. The loops in the liveness functions do not carry state; they're only interested in ordering in that they accept the latest version as live. > > + def test_find_live_binary_versions_pass_2_blesses_latest(self): > > + bpphs = make_bpphs_for_versions(self.factory, ['1.2', '1.1', > '1.0']) > > + make_publications_arch_specific(bpphs, False) > > + self.assertEqual(['1.2'], find_live_binary_versions_pass_2(bpphs)) > > + > > + def test_find_live_binary_versions_pass_2_blesses_arch_specific(self): > > + versions = list(reversed(['1.%d' % version for version in > range(3)])) > > + bpphs = make_bpphs_for_versions(self.factory, versions) > > + make_publications_arch_specific(bpphs) > > + self.assertEqual(versions, find_live_binary_versions_pass_2(bpphs)) > > Can you comment what "blesses" means in the last two tests - you'll thank > me in a year's time when you come back to this Done. But, er... does this mean I'm condemned to another year of this? :-) Jeroen